Systems and methods for a simulation program of a percolation model for the loss distribution caused by a cyber attack

ABSTRACT

Aspects of the present disclosure relate to implementation of a dynamical structural percolation model, that may be implemented via at least one computing device, and models the loss distribution for cyber risk due to breach of IT networks of a single small or medium-sized enterprise, which is believed to be a technical improvement responsive to the foregoing issues and challenges with known risk/loss distribution models.

CROSS REFERENCE TO RELATED APPLICATIONS

This is a non-provisional application that claims benefit to U.S. provisional patent application Ser. No. 62/730,649 filed on Sep. 13, 2018, which is herein incorporated by reference in its entirety.

FIELD

The present disclosure generally relates to a model for loss distribution of cyber risk; and in particular, to systems and methods for a percolation model for the loss distribution caused by a cyberattack.

BACKGROUND

According to the European Solvency II Directive, operational risk, of which cyber risk is a part, is defined as the risk of loss arising from inadequate or failed internal processes, personnel or systems, or from external events. This definition, albeit presented in general terms, highlights financial loss as its key aspect. In contrast, in the current actuarial literature on cyber risk, the most frequently cited definition describes cyber risks as ‘operational risks to information and technology assets that have consequences affecting the confidentiality, availability, or integrity of information or information systems’. While this definition captures well the phenomenon of cyber risk as seen from a computer science perspective, it does not characterize any associated financial losses or risks, which are essential for the actuarial community. A more recent definition attempts to reconcile the previous two by stating that ‘cyber risk means any risk of financial loss, disruption or damage to the reputation of an organization from some sort of failure of its information technology systems’. As the goal of this work is to consider only losses that occur as a result of unauthorized breaches of IT systems, which constitute a substantial component of cyber risk, all of the previously given definitions remain too broad, since they encompass financial losses resulting both from external and internal sources under various motivations of the agents involved. Instead, we precisely define the aspect of cyber risk which we consider, and refer to it as cyber risk due to breach, in the following way: cyber risk due to breach is ‘the risk of a financial loss caused by a breach of an institution's IT infrastructure by unauthorized parties, resulting in exploitation, taking possession of, or disclosure of data assets’.

The McKinsey 2018 report reveals that, globally, companies invested $500 M in cyber-security in 2017, but have nonetheless incurred dramatic losses estimated at $400 B. The findings of the reports further underline the importance of cyber risk and its financial implications. Clearly, the existing situation creates a considerable opportunity for insurers to open new product lines and capitalize on this new risk type, while providing a useful insurance product to consumers. In order for this to be achieved in a sustainable manner, a proper pricing mechanism for this new type of risk has to be developed. Therefore, a better understanding of cyber risk and its pricing carries importance for both the academic and practitioner actuarial communities, as well as for the society at large.

Insurance risk pricing is methodologically driven largely by the existence of a loss distribution for a particular risk. Once that loss distribution is known, the price of an insurance product to protect against the risk whose materialization induced the loss distribution can be ascertained by means of well-established actuarial methods, and is dependent on use of appropriate risk measures. In practice, only after sufficient time has passed and enough data points have been collected, the empirical loss distribution is revealed, and, as such, can become a basis for pricing of insurance products.

It is with these observations in mind, among others, that various aspects of the present disclosure were conceived and developed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a simplified block diagram of a system or network for implementing the percolation model described herein.

FIGS. 2A-2F are histograms of the random variable representing loss from cyber risk due to breach, based on one million simulations. Log-normally distributed costs are considered under the assumption of a random tree LAN topology and two probabilities of edge contagion. The presented case is that of high volatility of cost.

FIGS. 3A-3F are histograms of the random variable representing loss from cyber risk due to breach, based on one million simulations. Log-normally distributed costs are considered under the assumption of a random tree LAN topology and two probabilities of edge contagion. The presented case is that of low volatility of cost.

FIGS. 4A-4B are illustrations showing how to count the number of self-avoiding paths when 0<2r<R on the left in FIG. 4A, and how to count the number of self-avoiding paths when 2r>R>r on the right in FIG. 4B.

FIG. 5 is an example schematic diagram of a computing device that may implement various methodologies of the percolation model described herein.

Corresponding reference characters indicate corresponding elements among the views of the drawings. The headings used in the figures do not limit the scope of the claims.

DETAILED DESCRIPTION

A computer-implemented framework or platform including associated methods for implementing a structural model of aggregate loss distribution for cyber risk is disclosed. In some embodiments, the structural model is formulated and implemented under the assumption of a tree-based LAN topology. In general, the problem of aggregate loss distribution may be contextualized in a probabilistic graph-theoretical framework using at least one percolation model. It may be assumed that the information technology (IT) network topology is represented by an appropriate random graph of finite or infinite size, allowing for heterogeneous loss topology, and providing instructive numerical examples.

Introduction and Technical Need/Challenges for Modeling Cyber Risk

Cyber risk due to breach can be seen as a risk of a financial loss due to breach of an institution's IT infrastructure by unauthorized parties and exploiting, taking possession of, or disclosing data assets, thus creating financial and/or reputation damage. In the case of cyber insurance, loss distributions of sufficient quality are lacking. In the academic literature so far, research in cyber risk relevant for actuarial science encompasses the application of various statistical methodologies on limited data sources. The data sources used in known studies suffer from a lack of granularity due to the underreporting of cyber-risk-relevant incidents and their types across various industries and sectors. Moreover, there is insufficient information about the losses caused by each of the reported attacks and, therefore, the connection between the number of breaches and the losses must be imputed ex-post. Some works look at cyber risk through the lenses of insurance economics and do not consider questions of loss distributions and pricing.

It is believed that existing loss distribution models are inadequate largely due to their choice of underlying processes: the resulting analytical and computational complexity of these models precludes their direct use in the context of medium-sized companies or networks having, for example, more than fifty computers/vertices. In addition, these existing models often require mean-field approximations under very specific parameter choices, where curing rates are higher than infection rates, and demand higher-order approximations for a higher number of computers, resulting in an increasing computational burden. For all of these reasons, these models lend themselves poorly to the realities of medium to large computer networks. Moreover, it is believed that no theoretical models of an aggregate loss distribution for cyber risk exist in this setting, as further described herein.

Proposed Technical Improvement

Aspects of the present disclosure relate to implementation of a dynamical structural percolation model 101 (percolation model 101), that may be implemented via at least one computing device, and models the loss distribution for cyber risk due to breach of IT networks of a single small or medium-sized enterprise, which is believed to be a technical improvement responsive to the foregoing issues and challenges with known risk/loss distribution models. By way of further introduction, recall that a network that connects computers and other devices in a relatively small area, such as an office building, is called a Local Area Network (LAN). For a given LAN, the logical topology describes the arrangement of the devices on the LAN and the way in which they communicate with one another, whereas its physical topology refers to the geometric arrangement of the devices on the LAN network, such as computers, cables and other devices. Simply put, the logical layer of a LAN represents the arrangement of devices as data flowing through the network would perceive it, whereas the physical layer of a LAN is the arrangement in physical space of the computers, cables and other devices, in a way in which a human being would see it. The physical topology of a network can be of various types: bus, star, ring, circular, mesh, tree, hybrid, or daisy chain. Today, most networks are, in fact, of the hybrid type, that is, they are represented by a combination of several network types.

The ever-changing landscape of cyber risk threats across multiple sublayers of the logical network, such as the operating system and the applications layer, makes the modeling of various types of threats on the logical layer a mathematically daunting task subject to the whims of technological progress. Thus, the novel model described herein considers the physical layer of the LAN and assumes a tree network topology, as a composition of the star and bus networks, due to its scaling property and convenience for small and medium-sized enterprises. In addition, bond percolation may be used to introduce a contagion process into the physical layer. By choosing bond percolation as a model of contagion, a view is taken which is believed to be relevant for actuaries and is consistent with actuarial modeling practice. For example, when considering loss due to fire damage, there is usually no need to model the fire process and the reaction by firefighters. Similarly, when considering losses due to mortality, actuaries do not model in detail the processes which led to the deaths of individuals or the medical counter-action. When transposed to the setting presented with the percolation model 101, this means that the burden placed on the contagion process is to reveal the computers on which losses due to breach have occurred, and not necessarily the reality of the struggles of IT staff to contain this process or the full reality of the process itself. To model the temporal dynamics of the network, that is, the fact that the network changes over time, random tree graphs may be used with the percolation model 101 as an underlying mathematical structure. It may also be assumed that an arrangement of data assets exists and that it follows the topology of the network, i.e., a data asset may be attached with a certain value to each node of the network. This arrangement in space of data asset values constitutes a cost topology. The infection of a node in the network entails the loss of the data asset and its value. To account for the dynamical nature of data assets across time and over the evolving network, it is assumed that data asset values are represented by random variables. The percolation model 101 then defines a contagion process stemming from an event of a breach of a node given a particular temporal instance of network topology. Finally, the sum of all losses, given a particular node at which the infection starts and the realization of the associated contagion process, characterizes one observation point in the aggregate loss distribution due to breach.

Further within the present disclosure, analytical results are given related to the mean, variance, and tail behavior of aggregate losses resulting from cyber attacks due to breach. The analytical and numerical results hold for arbitrarily large random trees; however, since an exclusively tree-based LAN topology is not typically observed in large companies, the results are mostly applicable to small to medium-sized companies, which are the dominant form of company in an economy. Finally, the analytical results provided give an exact expression of the expectation of aggregate losses of cyber risk due to breach for the model itself, not just an approximation of the percolation model 101, and hold for all possible parameters (all possible infection rates and all possible numbers of computers), not just a subset of the parameters.

In contrast to the foregoing existing models, the percolation model 101 is taken from a substantially different conceptual view, starting from the way one perceives or looks at the computer network, and extending to the choice of modeling the contagion process and the costs that occur due to contagion. Various other advantages are further realized. Referring to FIG. 1, a system 100 is shown that may be used to implement the percolation model 101 described herein. In particular, aspects of the percolation model 101 may take the form of (or otherwise be implemented as) an application 102 implemented by at least one computing device 104, which may be a server, a computing system, or implemented as part of a cloud-computing environment. Using the application 102, the computing device 104 may be configured to execute or otherwise perform operations defined by the percolation model 101 and further described herein. In other words, functions or services provided by the model 101 and/or the application 102 may be implemented as code and/or machine-executable instructions executable by the computing device 104 that may represent one or more of a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements, and the like. As such, embodiments of the application 102 and the model 101 described herein may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments to perform the necessary tasks (e.g., a computer-program product) may be stored in a computer-readable or machine-readable medium, and a processor(s) associated with the computing device 104 may perform the tasks defined by the code.

As shown, the system 100 may further include a network topology 112 associated with a plurality of devices 114 (designated device 114A, device 114B, and device 114C) or nodes, any of which may be computing devices such as laptops or tablets, servers, and/or may include mobile devices such as smartphones, and the like. The topology 112 may be a LAN topology associated with a single small or medium-sized enterprise, and the percolation model 101 may be implemented to model the loss distribution for cyber risk due to breach of a network associated with the topology 112. As indicated, the computing device 104 may access data 116 associated with the topology 112 through any one of the devices 114 or other means.

As further shown, the system 100 may also include a client application 120 which may be configured to provide aspects of the percolation model 101 to any number of client devices 122 via a network 124, such as the Internet, a local area network, a cloud, and the like. In addition, data about network topologies, threats, or other related areas including metadata, and other forms of data may be stored within a database 130 accessible by the computing device 104.

Stochastic Modeling of Cyber Attacks

The mathematical framework of the percolation model 101 shall now be described to define or model loss due to cyber attacks. We introduce, and study both analytically and numerically, a stochastic model (L_(t)) that keeps track of the aggregate loss up to time t due to cyber attacks. For the purpose of pricing, the goal is to study the mean and variance of L_(t). Intuitively, the process (L_(t)) is a continuous-time Markov chain obtained from the combination of a Poisson process representing the times at which cyber attacks strike, a random graph representing the infrastructure network of the company at the time of the attack, and a percolation process on the random graph modeling contagion. More precisely, the process can be constructed using the following components:

- A Poisson process (N_(t)) with intensity λ.
- A random tree 𝒯_(R)=(V, E) constructed recursively in R steps (representing the radius) from an integer-valued random variable X with probability mass function P(X=k)=p_(k).
- The global vulnerability of the network p∈(0, 1).
- A collection of local costs C_(y)>0 attached to each vertex y of the random tree.

The process evolves as follows. At the arrival times

$T^{i} = \inf\{ t : N_{t} = i \}$

of the Poisson process, we let $\mathcal{T}_{R}^{i} = (V^{i}, E^{i})$ be a realization of the random tree that we think of as the infrastructure network of the company at time T^(i). To construct the tree, we draw X edges starting from a root called 0, meaning k edges with probability p_(k), and additional edges starting from each of the subsequent vertices using the same probability distribution. The construction stops after R steps so the tree has radius at most R. The trees generated at different times

$\mathcal{T}_{R}^{1} = (V^{1}, E^{1}), \quad \mathcal{T}_{R}^{2} = (V^{2}, E^{2}), \quad \ldots, \quad \mathcal{T}_{R}^{i} = (V^{i}, E^{i}), \quad \ldots$

are independent and identically distributed. To model contagion on the ith tree, we let x^(i)∈V^(i) be a vertex chosen uniformly at random from the vertex set representing the starting point of the cyber attack, and define a bond percolation process on the tree by letting

$\xi^{i}(e) \sim \text{Bernoulli}(p) \quad \text{for all } e \in E^{i}$

be independent Bernoulli random variables. Following the usual terminology of percolation theory, edges with ξ^(i)(e)=1 are said to be open, and we assume that the ith cyber attack results in all the vertices in the open cluster starting at x^(i),

$\mathcal{C}^{i}(x^{i}) = \{ y \in V^{i} : \text{there is a path of open edges connecting } x^{i} \text{ and } y \},$

being hacked. Then, letting C_(y)^(i) for y∈V^(i) be independent and identically distributed random variables representing the cost of each vertex, we define the size of the cyber attack S^(i) and the global loss C^(i) caused by the cyber attack as the number of vertices being hacked and the cumulative cost of all the vertices being hacked, respectively:

$S^{i} = \operatorname{card}\left( \mathcal{C}^{i}(x^{i}) \right) \quad \text{and} \quad C^{i} = \sum_{y \in \mathcal{C}^{i}(x^{i})} C_{y}^{i}.$

Finally, the random variable L_(t) is defined as the aggregate loss caused by all the cyber attacks that occurred until time t. In equation form,

$L_{t} = \sum_{i=1}^{N_{t}} C^{i} = \sum_{i=1}^{N_{t}} \sum_{y \in \mathcal{C}^{i}(x^{i})} C_{y}^{i}.$
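The construction of a single attack described above can be sketched in code. The following Python snippet is a minimal illustration only, not the disclosed implementation: the function names `random_tree` and `attack`, the adjacency-list representation, and the `draw_cost` callback are all assumptions introduced here. It grows a random tree for R generations from the offspring probability mass function, picks a uniformly random starting vertex, opens each edge independently with probability p, and sums independent costs over the resulting open cluster.

```python
import random
from collections import deque


def random_tree(pmf, radius, rng):
    """Grow a random tree for `radius` generations.

    pmf[k] is the probability that a vertex gets k edges leading away
    from the root. Returns adjacency lists; vertex 0 is the root.
    """
    adj = [[]]
    frontier = [0]
    offspring_counts = range(len(pmf))
    for _ in range(radius):
        next_frontier = []
        for v in frontier:
            k = rng.choices(offspring_counts, weights=pmf)[0]
            for _ in range(k):
                child = len(adj)
                adj.append([v])
                adj[v].append(child)
                next_frontier.append(child)
        frontier = next_frontier
    return adj


def attack(adj, p, draw_cost, rng):
    """Simulate one attack and return (S, C): cluster size and total cost.

    Each edge is opened independently with probability p. Because the
    graph is a tree, the search below inspects every edge at most once,
    so flipping the coin lazily is equivalent to bond percolation.
    draw_cost(rng) is a hypothetical callback returning one vertex cost.
    """
    start = rng.randrange(len(adj))  # uniform starting vertex
    hacked = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in hacked and rng.random() < p:
                hacked.add(w)
                queue.append(w)
    return len(hacked), sum(draw_cost(rng) for _ in hacked)
```

A call such as `attack(random_tree([0.0, 0.1, 0.3, 0.6], 4, rng), 0.2, lambda r: r.lognormvariate(6.9, 0.05), rng)` with `rng = random.Random(seed)` would produce one observation of (S, C); the log-normal parameters in that call are placeholders rather than values taken from the disclosure.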

As previously mentioned, a frequent requirement of insurance pricing is to compute the mean and the variance of L_(t) (see [14] or [20]). Note that, because the random trees, the percolation processes on those trees, and the local costs on the vertices are independent and identically distributed for different cyber attacks, by conditioning on the number of attacks N_(t) until time t, we get

$E(L_{t} \mid N_{t} = n) = E(C^{1} + \cdots + C^{N_{t}} \mid N_{t} = n) = n\,E(C^{1})$

$\mathrm{Var}(L_{t} \mid N_{t} = n) = \mathrm{Var}(C^{1} + \cdots + C^{N_{t}} \mid N_{t} = n) = n\,\mathrm{Var}(C^{1})$

The first equation implies that

$E(L_{t}) = E(E(L_{t} \mid N_{t})) = E(N_{t}\,E(C^{1})) = E(N_{t})\,E(C^{1}) = \lambda t\,E(C^{1}) \qquad (1)$

and, using also the second equation and the law of total variance, we get

$\begin{aligned} \mathrm{Var}(L_{t}) &= E(\mathrm{Var}(L_{t} \mid N_{t})) + \mathrm{Var}(E(L_{t} \mid N_{t})) \\ &= E(N_{t}\,\mathrm{Var}(C^{1})) + \mathrm{Var}(N_{t}\,E(C^{1})) \\ &= E(N_{t})\,\mathrm{Var}(C^{1}) + \mathrm{Var}(N_{t})\,(E(C^{1}))^{2}. \end{aligned} \qquad (2)$

Equations (1) and (2) show that the mean and variance of the total loss up to time t can be easily expressed using the mean and variance of the cost of one cyber attack. Therefore, we now focus on the cost of a single attack, and drop all the superscripts i referring to the number of the attack to avoid cumbersome notations.
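Since (N_(t)) is a Poisson process with intensity λ, both E(N_(t)) and Var(N_(t)) equal λt, so (2) can be written more compactly. The following worked step is added for illustration and uses only quantities already defined above:

$\mathrm{Var}(L_{t}) = \lambda t\,\mathrm{Var}(C^{1}) + \lambda t\,(E(C^{1}))^{2} = \lambda t\,E\left( (C^{1})^{2} \right).$

For instance, with λ=1, t=1, E(C¹)=1000 and Var(C¹)=250000 (illustrative values, not results from this disclosure), the variance of the aggregate loss would be 250000+1000²=1250000.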

Remark 1—Consider a scenario where, for a given (realization of a random) tree, instead of assuming that all nodes contain unique data, there exist at least two nodes x and y that share data of value C_(xy). Our model assumes that the two vertices being hacked results in a loss of C_(x)+C_(y) whereas the actual loss when data are shared is just C_(x)+C_(y)−C_(xy). In particular, our model gives an upper bound for the loss over all the cases where data are shared among nodes.

Analytical Results

Recall that the number of edges starting from each vertex moving away from the root is described by the random variable X with probability mass function (p_(k))_(k=0) ^(∞). To state our result, we let

$\mu = E(X) = \sum_{k=0}^{\infty} k\,p_{k} \quad \text{and} \quad \sigma^{2} = \mathrm{Var}(X) = \sum_{k=0}^{\infty} (k - \mu)^{2}\,p_{k}$

be the mean and the variance of X. Throughout this paper, probabilities, expected values and variances with the subscript r denote their conditional counterparts given that the cyber attack starts from a vertex located at distance r from the root. To begin with, we study the process when the attack starts from the root, meaning that r=0. In this case, the number of vertices being hacked is related to the number of individuals in a branching process. The analysis is simplified and the mean and variance of the total cost can be computed explicitly due to spherical symmetry, and conveniently expressed using the mean and variance μ and σ² introduced above.

Theorem 2—For an attack starting at the root,

$E_{0}(C) = E_{0}(S)\,E(C_{0}) \quad \text{and} \quad \mathrm{Var}_{0}(C) = E_{0}(S)\,\mathrm{Var}(C_{0}) + \mathrm{Var}_{0}(S)\,(E(C_{0}))^{2}$

where the mean and variance of S are given by:

$E_{0}(S) = \frac{1 - (\mu p)^{R+1}}{1 - \mu p}$

$\mathrm{Var}_{0}(S) = \frac{p(1-p)\mu + p^{2}\sigma^{2}}{(1 - \mu p)^{2}} \left( \frac{1 - (\mu p)^{2R+1}}{1 - \mu p} - (2R+1)(\mu p)^{R} \right).$
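The closed-form expressions of Theorem 2 are straightforward to evaluate numerically. The Python sketch below is offered as an illustration under stated assumptions: the function names are hypothetical, the offspring distribution is passed as a finite list `pmf` with `pmf[k] = p_k`, and the case μp=1 (where the formulas as written divide by zero) is rejected rather than handled by a limit.

```python
def offspring_moments(pmf):
    """Mean mu and variance sigma^2 of X, where pmf[k] = P(X = k)."""
    mu = sum(k * pk for k, pk in enumerate(pmf))
    sigma2 = sum((k - mu) ** 2 * pk for k, pk in enumerate(pmf))
    return mu, sigma2


def root_attack_size_moments(pmf, p, R):
    """E_0(S) and Var_0(S) from Theorem 2 for an attack starting at the root."""
    mu, sigma2 = offspring_moments(pmf)
    m = mu * p
    if abs(1.0 - m) < 1e-12:
        raise ValueError("formulas as written require mu * p != 1")
    mean = (1.0 - m ** (R + 1)) / (1.0 - m)
    s2 = p * (1.0 - p) * mu + p * p * sigma2
    var = s2 / (1.0 - m) ** 2 * (
        (1.0 - m ** (2 * R + 1)) / (1.0 - m) - (2 * R + 1) * m ** R
    )
    return mean, var


def root_attack_cost_moments(pmf, p, R, cost_mean, cost_var):
    """E_0(C) and Var_0(C) from Theorem 2, given E(C_0) and Var(C_0)."""
    size_mean, size_var = root_attack_size_moments(pmf, p, R)
    mean = size_mean * cost_mean
    var = size_mean * cost_var + size_var * cost_mean ** 2
    return mean, var
```

As a sanity check on the formulas, for R=1 the size is one plus the number of open edges at the root, so the mean reduces to 1+μp and the variance to p(1−p)μ+p²σ².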

It follows from the theorem that the expected cost is infinite on the limiting tree obtained by taking the limit as R→∞ when μp>1. In fact, not only is the expected cost infinite, but there is also a positive probability that the cost is infinite. This probability can be expressed implicitly using the probability generating function ϕ(θ)=E(θ^(Y)) of the random variable

$Y = Y_{1} + Y_{2} + \cdots + Y_{X}$

where the Y_(i) are independent Bernoulli random variables with parameter p. This probability is also related to what we shall call the radius R₀ of a cyber attack starting at the root:

$R_{0} = \max\{ d(0, y) : y \in \mathcal{C}(0) \},$

the maximum graph distance (number of edges) from the root among all vertices that are being hacked. The next result states that the radius of an attack decays exponentially when μp<1 whereas the attack reaches the leaves of the random tree with a positive probability that does not depend on the size of the tree when μp>1.

Theorem 3 (tail behavior)—For an attack starting at the root,

- When μp<1, we have exponential decay: P₀(R₀≥n)≤(μp)^(n).
- When μp>1, the function ϕ has a unique fixed point θ_(∞)∈(0, 1) and P₀(R₀=R)≥1−θ_(∞)>0 for all R>0.

The expected value, variance and tail distribution are much more difficult to study when the cyber attack does not start from the root. However, we were able to compute explicitly the expected value for all r≤R using combinatorial techniques to count the number of self-avoiding paths of a given length starting from the origin of the attack. To state our results, we let

$\bar{\mu} = E(X - 1 \mid X \neq 0) = \frac{\mu}{1 - p_{0}} - 1.$

Theorem 4 (mean)—Let R be even. Then E_(r)(C)=E_(r)(S)E(C₀) where

$E_{r}(S) = \frac{1}{1 - \mu p} \left( 1 + \frac{p(1 - p^{r})\left( 1 - (\mu - \bar{\mu})p \right)}{1 - p} - \frac{(\mu p)^{R-r+1} \left( 1 - p^{2}\left( \mu - \bar{\mu}\left( 1 - (\mu p^{2})^{r} \right) \right) \right)}{1 - \mu p^{2}} \right) \quad \text{for all } 0 \leq r \leq R. \qquad (3)$

Even though it is not obvious from the theorem, it can be proved (and it is intuitively clear) that the expected cost is increasing with respect to the radius R. Indeed,

$1 - \mu p^{2} > 0 \iff 1 - (\mu p^{2})^{r} > 0 \iff 1 - p^{2}\left( \mu - \bar{\mu}\left( 1 - (\mu p^{2})^{r} \right) \right) > 1 - \mu p^{2}$

which, together with the fact that

$1 - \mu p > 0 \iff \mu p < 1 \iff R \mapsto -(\mu p)^{R-r+1} \text{ is increasing,}$

implies that the expression φ_(r)(R) on the right-hand side of (3) is increasing in R. The assumption that R is even is purely technical, but the proof of the theorem (see Lemma 11 below) shows that the expected cost is globally increasing with respect to R. In particular, the expected cost for R odd can be bounded from below and from above using the theorem and the fact that φ_(r)(R−1)<E_(r)(S)<φ_(r)(R+1) for all 0≤r≤R.

Another consequence of Lemma 11 that is again intuitively obvious is that, everything else being fixed, the expected cost of a cyber attack is increasing with respect to p. The expected cost, however, is not monotone with respect to the distance r. For instance, when the network is the deterministic regular tree with degree d, at least for small values of p, the expected cost first increases while moving away from the root but then decreases while moving closer to the leaves (see Table 2 below). The variance of the cost caused by an attack starting at vertex x≠0 is much more difficult to compute. However, a simple qualitative argument gives the following upper bound.

Theorem 5 (variance)—For all r≤R, Var_(r)(C)≤E _(r)(S)Var(C ₀)+(p ^(−2R)−1)(E _(r)(S)E(C ₀))².

Our theorems only give conditional mean and variance given that the cyber attack starts at a certain distance from the root. These results, however, can be used to study the unconditional mean and variance that appear in (1) and (2) for deterministic trees with X=d where each vertex has degree d+1, except for the root that has degree d and for the leaves that have degree one. For this particular graph, applying Theorems 4 and 5 gives the following corollary.

Corollary 6 (regular tree)—For the regular tree with X=d,

$E(C) = \left( \frac{1 - d}{1 - d^{R+1}} \right) \sum_{r=0}^{R} d^{r}\,E_{r}(S)\,E(C_{0})$

$\mathrm{Var}(C) \leq \left( \frac{1 - d}{1 - d^{R+1}} \right) \left( \sum_{r=0}^{R} d^{r}\,E_{r}(S)\,\mathrm{Var}(C_{0}) + \left( \sum_{r=0}^{R} d^{r} p^{-2R} (E_{r}(S))^{2} - \left( \frac{1 - d}{1 - d^{R+1}} \right) \sum_{r,s} d^{r+s}\,E_{r}(S)\,E_{s}(S) \right) (E(C_{0}))^{2} \right)$

where the conditional expected size is given by

$E_{r}(S) = \left( \frac{1}{1 - dp} \right) \left( 1 + p(1 - p^{r}) - \frac{(dp)^{R-r+1} \left( 1 - p^{2}\left( 1 + (d - 1)(d p^{2})^{r} \right) \right)}{1 - d p^{2}} \right).$
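The conditional expected size for the regular tree can be cross-checked numerically. On a tree there is exactly one path between any two vertices, so a vertex y belongs to the open cluster of the starting vertex x with probability p raised to the graph distance d(x, y), and the expected cluster size is the sum of these probabilities over all vertices. The Python sketch below compares that direct sum with the closed-form expression for E_(r)(S) stated in Corollary 6; the function names are hypothetical, and the closed form is evaluated only where dp≠1 and dp²≠1.

```python
from collections import deque


def closed_form_size(d, p, R, r):
    """E_r(S) for the regular tree, as stated in Corollary 6."""
    dp, dp2 = d * p, d * p * p
    tail = dp ** (R - r + 1) * (1 - p * p * (1 + (d - 1) * dp2 ** r)) / (1 - dp2)
    return (1 + p * (1 - p ** r) - tail) / (1 - dp)


def brute_force_size(d, p, R, r):
    """Sum of p**distance over all vertices, starting at depth r."""
    # Build the regular tree of radius R level by level.
    adj, depth, frontier = [[]], [0], [0]
    for level in range(1, R + 1):
        next_frontier = []
        for v in frontier:
            for _ in range(d):
                w = len(adj)
                adj.append([v])
                adj[v].append(w)
                depth.append(level)
                next_frontier.append(w)
        frontier = next_frontier
    start = depth.index(r)  # any vertex at depth r works, by symmetry
    # Breadth-first search gives the graph distance to every vertex.
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return sum(p ** k for k in dist.values())
```

Evaluating `closed_form_size(d, p, R, r)` for r=0, ..., R at fixed d, p, and R is one way to inspect how the expected size varies with the distance r of the starting vertex from the root.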

Looking at the tail distribution, for the regular tree, the random variable Y is the sum of X=d independent Bernoulli random variables with parameter p so it is binomial and we get

${\phi(\theta)} = {{E\left( \theta^{Y} \right)} = {{\sum\limits_{k = 0}^{\infty}{\theta^{k}{P\left( {Y = k} \right)}}} = {{\sum\limits_{k = 0}^{d}{\begin{pmatrix} d \\ k \end{pmatrix}{p^{k}\left( {1 - p} \right)}^{d - k}\theta^{k}}} = {\left( {1 - p + {p\;\theta}} \right)^{d}.}}}}$

For instance, for the binary tree with d=2, we have

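$\phi(\theta) - \theta = (1 - p + p\theta)^{2} - \theta = p^{2}\theta^{2} - \left( p^{2} + (1 - p)^{2} \right)\theta + (p - 1)^{2} = p^{2}(\theta - 1)\left( \theta - \left( 1 - \frac{1}{p} \right)^{2} \right)$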

showing that, for an attack starting at the root of the infinite binary tree,

$P_{0}(R_{0} = R) \geq 1 - \left( 1 - \frac{1}{p} \right)^{2} \quad \text{when } p > \frac{1}{2}$

while P₀(R₀≥n)≤(2p)^(n) when p<½, according to Theorem 3.
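For offspring distributions where ϕ(θ)−θ does not factor as conveniently as in the binary case, the fixed point θ_(∞) can be approximated numerically. The Python sketch below is a minimal illustration, not part of the disclosed method: it uses the identity ϕ(θ)=E((1−p+pθ)^(X)), which follows from Y being a sum of X independent Bernoulli(p) variables, and iterates θ←ϕ(θ) starting from 0, a standard branching-process technique that converges to the smallest fixed point of ϕ in [0, 1]. The function names, tolerance, and iteration cap are assumptions.

```python
def pgf(theta, pmf, p):
    """phi(theta) = E(theta^Y) = sum_k p_k * (1 - p + p*theta)^k."""
    base = 1.0 - p + p * theta
    return sum(pk * base ** k for k, pk in enumerate(pmf))


def smallest_fixed_point(pmf, p, tol=1e-14, max_iter=100000):
    """Iterate theta <- phi(theta) from 0 until successive values agree."""
    theta = 0.0
    for _ in range(max_iter):
        nxt = pgf(theta, pmf, p)
        if abs(nxt - theta) < tol:
            return nxt
        theta = nxt
    return theta


def leaf_reach_lower_bound(pmf, p):
    """Lower bound 1 - theta_inf from Theorem 3 (informative when mu*p > 1)."""
    return 1.0 - smallest_fixed_point(pmf, p)
```

For the binary tree, `smallest_fixed_point([0, 0, 1], p)` can be compared with the closed-form root (1−1/p)² derived above; when μp<1 the iteration converges to 1 and the bound becomes trivial.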

Numerical Results

As a numerical example, we perform pricing (P) of cyber risk insurance under the assumptions of the developed model, i.e., given a small or medium-sized enterprise with a tree-based LAN topology. To that aim, we consider the following pricing principles:

- actuarial fair premium: P=E(L)
- expectation principle: P=E(L)+δE(L)
- standard deviation principle: P=E(L)+δ√Var(L)
- semi-variance principle: P=E(L)+δE(((L−E(L))⁺)²)

We keep the parameter δ=0.1. Without loss of generality, we assume λ=1, thus the attacks occur at rate one per unit of time. In practice, to choose the rate of attacks λ, an insurance company should petition its cyber risk due to breach policyholders for information about the average number of attacks per unit of time.
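Given a sample of simulated aggregate losses, the four premiums can be estimated directly. The following Python sketch is illustrative only: the function name `premiums` and the dictionary keys are hypothetical, moments are estimated with the population (divide-by-n) convention, and the semi-variance loading is read as the expected squared positive deviation E(((L−E(L))⁺)²), which is an assumption about the intended grouping.

```python
from math import sqrt


def premiums(losses, delta=0.1):
    """Estimate the four premium principles from simulated losses.

    losses: non-empty sequence of simulated aggregate losses L.
    delta:  loading parameter (0.1 in the numerical example).
    """
    n = len(losses)
    mean = sum(losses) / n
    # Population variance of the sample (divide by n, an assumption).
    var = sum((x - mean) ** 2 for x in losses) / n
    # Upper semi-variance: only losses above the mean contribute.
    semi_var = sum(max(x - mean, 0.0) ** 2 for x in losses) / n
    return {
        "actuarial_fair": mean,
        "expectation": mean + delta * mean,
        "standard_deviation": mean + delta * sqrt(var),
        "semi_variance": mean + delta * semi_var,
    }
```

Note that the semi-variance loading is expressed in squared monetary units, so its magnitude relative to the other premiums depends on the scale of the losses.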

Three choices of probability mass function [p₀, p₁, p₂, p₃] for random tree edge formation are considered. The tree with probabilistic formation of edges under consideration was characterized by the probability mass function [0.0, 0.1, 0.3, 0.6]. Hence, for each vertex, the probability of zero offspring p₀ with this choice is made to be zero. The probability of one offspring is p₁=0.1, the probability of two offspring is p₂=0.3, and the probability of three offspring is p₃=0.6. The deterministic tree characterized by the probability mass function of edge formation [0, 0, 0, 1], in number of vertices, dominates the above chosen tree with probabilistic formation of edges, given the same radius R. Across all experiments the radius of trees is chosen to be four. Also, given the same radius, the chosen probabilistic tree, in terms of number of vertices, dominates the deterministic trees characterized by the probability mass function of edge formation [0, 1, 0, 0].

We allow for four types of distribution of cost C due to breach: deterministic, Bernoulli distribution, log-normal distribution both with high and low variance, and normal distribution with low variance. The choice of expectation of cost E(C) was stylized, kept to 1000 monetary units, and made consistent across all cost distributions. The standard deviation of cost √Var(C) is allowed to change across distributions such that it grows from 0 for the deterministic case, to approximately 284 for Bernoulli distributed cost, and is either 50 or 500 for log-normally distributed cost. The normal distribution, due to its support being the entire real line, for theoretical reasons, is not an appropriate choice for the distribution of costs. Nevertheless, for purposes of comparing the impact of choice of distribution on pricing, we consider it and choose its standard deviation to be 50. Finally, two cases of probability of edge contagion were considered: low probability of edge contagion characterized by p=0.2 and high probability of edge contagion characterized by p=0.8. In practice, to choose the probability of edge contagion p, an insurance company may perform risk classification by clustering its portfolio of cyber risk due to breach policyholders according to their adherence to a particular cyber risk management methodology. Regretfully, in practice, an exact value of p is unknowable. Thus, for risk classes, according to their riskiness, and based on its judgment, an actuary should impute values of edge contagion.

Within this experimental setting, to calculate the cyber risk insurance premiums, we perform one million simulations. Our unreported experiments confirm that this number of simulations is sufficient to achieve price stability and the desired accuracy.
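The simulation setting described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names `sample_tree` and `simulate_loss` are hypothetical, the attack origin is assumed to be drawn uniformly at random among the vertices, and a small number of runs is used for speed instead of one million.

```python
import random

def sample_tree(pmf, radius, rng):
    """Grow a random rooted tree; pmf[k] is the probability of k offspring."""
    parent = [None]              # parent[v] is the parent of vertex v; the root is 0
    frontier = [0]
    for _ in range(radius):
        next_frontier = []
        for v in frontier:
            k = rng.choices(range(len(pmf)), weights=pmf)[0]
            for _ in range(k):
                parent.append(v)
                next_frontier.append(len(parent) - 1)
        frontier = next_frontier
    return parent

def simulate_loss(pmf, radius, p, cost_sampler, rng):
    """One realization of the loss: total cost over the hacked cluster."""
    parent = sample_tree(pmf, radius, rng)
    n = len(parent)
    adj = [[] for _ in range(n)]
    for v in range(1, n):
        if rng.random() < p:     # edge (parent[v], v) is open with probability p
            adj[v].append(parent[v])
            adj[parent[v]].append(v)
    origin = rng.randrange(n)    # assumption: uniformly random attack origin
    hacked, stack = {origin}, [origin]
    while stack:                 # depth-first search along open edges
        u = stack.pop()
        for w in adj[u]:
            if w not in hacked:
                hacked.add(w)
                stack.append(w)
    return sum(cost_sampler(rng) for _ in hacked)

rng = random.Random(0)
losses = [simulate_loss([0.0, 0.1, 0.3, 0.6], 3, 0.2, lambda r: 1000.0, rng)
          for _ in range(10_000)]
print(sum(losses) / len(losses))
```

The `cost_sampler` argument is where a Bernoulli, log-normal or normal cost would be plugged in; the deterministic cost of 1000 is used here only to keep the sketch short.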

There are several findings that can be derived from numerical results presented in Table 1 herein. First, for a fixed probability of edge contagion, the premium under the actuarial fair principle is consistently higher for the deterministic tree with p₃=1.0 than for the tree with probabilistic edge formation. Conversely, the premium under the actuarial fair principle is consistently lower for the deterministic tree with p₁=1.0 than for the tree with probabilistic edge formation. This is also true for premiums based on expectations principle. Second, given standard deviation principle and for a fixed probability of edge contagion and a fixed tree type, keeping the same expectation of cost, as variance increases across cost probability distributions under consideration, the premiums increase. This is also true for premiums calculated based on semi-variance principle. Third, given any probability distribution of costs, for the same maximum radius and the same probability of edge contagion, the tree edge formation probability mass function has impact on prices such that trees with higher average number of vertices yield higher premiums. Fourth, comparing the prices based on semi-variance principle between log-normally and normally distributed costs with same mean and variance, we conclude that the choice of cost distribution (not just its mean and variance) has an impact on prices.
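The four premium principles compared in these findings can be computed from a sample of simulated losses along the following lines. This is a hedged sketch: the function name `premiums` is hypothetical, the population variance is used, and the semi-variance column is read literally from the formula printed in Table 1, that is, as the square of the expected positive deviation. If the intended quantity is the expected squared positive deviation, only the last entry changes.

```python
import math

def premiums(losses, delta=0.1):
    """Premiums under the four principles, estimated from a loss sample."""
    n = len(losses)
    mean = sum(losses) / n
    var = sum((x - mean) ** 2 for x in losses) / n          # population variance
    pos_dev = sum(max(x - mean, 0.0) for x in losses) / n   # E[(L - E[L])^+]
    return {
        "actuarial_fair": mean,
        "expectation": (1 + delta) * mean,
        "standard_deviation": mean + delta * math.sqrt(var),
        # assumption: literal reading of E[L] + delta * E[(L - E[L])^+]^2
        "semi_variance": mean + delta * pos_dev ** 2,
    }

print(premiums([0.0, 0.0, 0.0, 4.0]))
```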

Looking at the examples of distributions of loss due to breach in FIGS. 2A-2F and 3A-3F, we observe that these distributions loosely follow log-normal patterns and exhibit multi-modality for lower probability of edge contagion. Interestingly, the log-normal pattern observed in our structural model might serve as a theoretical underpinning of empirical findings for loss distributions based on data sets under their consideration. Unlike the loss distribution in FIGS. 2A-2F, which accounts for the case of high volatility cost and has a continuous interval for support, the loss distribution in FIGS. 3A-3F shows support emerging as a union of continuous subintervals. Unreported graphs of loss distributions due to breach, under the assumption of deterministic cost as well as Bernoulli distributed cost, present similar patterns of log-normal shape with multi-modalities. However, for deterministic and Bernoulli distributed costs, loss distributions due to breach have only discrete support.

For a comparison between the analytical results in Theorem 4 and simulation based results, we refer to Table 2 below. The table displays the analytical versus simulated values of the conditional expected size of a cyber attack given that the attack starts at level r. There, a deterministic tree with three offspring d=3, i.e., with probability mass function [0, 0, 0, 1], having four levels R=4 was assumed. The two probabilities of edge contagion are considered: p=0.2 and p=0.8. The simulation results are based on one million independent realizations. The table shows an almost perfect match between the analytical and simulation results. We also note that unreported experiments reveal that an increase in the number of simulations further reduces the difference between analytical and simulation results, in agreement with the strong law of large numbers.

In the present disclosure, as a main theoretical contribution to the existing actuarial literature, we develop a dynamic structural percolation model of loss distribution for cyber risk of small or medium-sized enterprises with tree based LAN topology. Specifically, the three major conceptual insights of this work are to focus on the physical layer of LAN topology, to impose a simple contagion process on the network based on percolation theory, and to introduce the topology of costs. Based on them, we robustly reduce the complexity of cyber risk phenomena and allow for effective modeling and pricing. From a modeling standpoint, we allow for the dynamic nature of LAN topology, as well as the temporal uncertainty of costs due to data breach. Within a rigorous mathematical framework, through probabilistic and combinatorial analysis, we characterize the main aspects of the loss distribution due to breaches. With the appropriate numerical experiments, we demonstrate the practical aspect of this research, reflected in a pricing methodology for cyber risk due to breach. Because cyber risk represents a significant emerging opportunity for insurers, the pricing of this new type of risk proposed in this work can prove to be of considerable value.

TABLE 1. The prices of cyber risk due to breach based on the Actuarial Fair Premium, Expectation Principle, Standard Deviation Principle and Semi-Variance Principle. Deterministic, Bernoulli, log-normally and normally distributed costs are considered, given three probability mass functions for random tree formation and two probabilities of edge contagion. The pricing is based on one million simulations where δ = 0.1.

| Cost Distribution | [p0, p1, p2, p3] | p | Actuarial Fair Premium: E[L] | Expectation Principle: E[L] + δE[L] | Standard Deviation Principle: E[L] + δ√Var[L] | Semi-Variance Principle: E[L] + δE[(L − E[L])⁺]² |
|---|---|---|---|---|---|---|
| Deterministic | [0.0, 0.0, 0.0, 1.0] | 0.2 | 1,595.38 | 1,754.91 | 1,706.47 | 17,787.03 |
| P[C = 1000] = 1 | [0.0, 0.1, 0.3, 0.6] | 0.2 | 1,553.99 | 1,709.39 | 1,655.12 | 15,602.43 |
| E[C] = 1000 | [0.0, 1.0, 0.0, 0.0] | 0.2 | 1,344.39 | 1,478.83 | 1,405.26 | 7,489.87 |
| √Var[C] = 0 | [0.0, 0.0, 0.0, 1.0] | 0.8 | 16,655.50 | 18,321.04 | 17,820.24 | 2,782,446.38 |
| | [0.0, 0.1, 0.3, 0.6] | 0.8 | 11,684.72 | 12,853.19 | 12,511.66 | 1,263,035.38 |
| | [0.0, 1.0, 0.0, 0.0] | 0.8 | 3,096.98 | 3,406.68 | 3,204.55 | 24,481.95 |
| Bernoulli | [0.0, 0.0, 0.0, 1.0] | 0.2 | 1,596.17 | 1,755.79 | 1,718.31 | 19,694.42 |
| P[C = 5,000.00] = 0.01 | [0.0, 0.1, 0.3, 0.6] | 0.2 | 1,554.82 | 1,710.30 | 1,667.97 | 17,445.23 |
| P[C = 959.60] = 0.99 | [0.0, 1.0, 0.0, 0.0] | 0.2 | 1,345.49 | 1,480.04 | 1,422.23 | 8,892.72 |
| E[C] ≈ 1000 | [0.0, 0.0, 0.0, 1.0] | 0.8 | 16,659.74 | 18,325.71 | 17,836.73 | 2,802,110.94 |
| √Var[C] ≈ 284 | [0.0, 0.1, 0.3, 0.6] | 0.8 | 11,689.78 | 12,858.76 | 12,528.26 | 1,284,190.50 |
| | [0.0, 1.0, 0.0, 0.0] | 0.8 | 3,094.22 | 3,403.65 | 3,222.74 | 27,766.60 |
| Log-normal | [0.0, 0.0, 0.0, 1.0] | 0.2 | 1,596.18 | 1,755.80 | 1,724.09 | 22,078.89 |
| E[C] = 1000 | [0.0, 0.1, 0.3, 0.6] | 0.2 | 1,553.20 | 1,708.52 | 1,671.82 | 19,709.60 |
| √Var[C] = 500 | [0.0, 1.0, 0.0, 0.0] | 0.2 | 1,344.41 | 1,478.85 | 1,428.50 | 11,470.20 |
| | [0.0, 0.0, 0.0, 1.0] | 0.8 | 16,652.75 | 18,318.02 | 17,835.26 | 2,814,229.56 |
| | [0.0, 0.1, 0.3, 0.6] | 0.8 | 11,687.73 | 12,856.51 | 12,532.14 | 1,299,585.20 |
| | [0.0, 1.0, 0.0, 0.0] | 0.8 | 3,094.18 | 3,403.60 | 3,233.21 | 34,470.74 |
| Log-normal | [0.0, 0.0, 0.0, 1.0] | 0.2 | 1,597.25 | 1,756.97 | 1,708.93 | 17,876.39 |
| E[C] = 1000 | [0.0, 0.1, 0.3, 0.6] | 0.2 | 1,555.11 | 1,710.62 | 1,656.51 | 15,638.29 |
| √Var[C] = 50 | [0.0, 1.0, 0.0, 0.0] | 0.2 | 1,343.55 | 1,477.90 | 1,404.64 | 7,470.12 |
| | [0.0, 0.0, 0.0, 1.0] | 0.8 | 16,660.19 | 18,326.20 | 17,825.25 | 2,783,400.15 |
| | [0.0, 0.1, 0.3, 0.6] | 0.8 | 11,703.76 | 12,874.14 | 12,530.80 | 1,261,778.29 |
| | [0.0, 1.0, 0.0, 0.0] | 0.8 | 3,095.74 | 3,405.32 | 3,203.78 | 24,632.38 |
| Normal | [0.0, 0.0, 0.0, 1.0] | 0.2 | 1,595.35 | 1,754.88 | 1,706.93 | 17,810.30 |
| E[C] = 1000 | [0.0, 0.1, 0.3, 0.6] | 0.2 | 1,503.81 | 1,709.12 | 1,655.24 | 15,612.43 |
| √Var[C] = 50 | [0.0, 1.0, 0.0, 0.0] | 0.2 | 1,344.25 | 1,478.68 | 1,405.41 | 7,488.77 |
| | [0.0, 0.0, 0.0, 1.0] | 0.8 | 16,629.39 | 18,292.33 | 17,795.39 | 2,789,773.43 |
| | [0.0, 0.1, 0.3, 0.6] | 0.8 | 11,686.85 | 12,855.53 | 12,513.20 | 1,260,716.39 |
| | [0.0, 1.0, 0.0, 0.0] | 0.8 | 3,097.32 | 3,407.05 | 3,205.29 | 24,610.16 |

There are ample opportunities for further research following this work. The two most prominent are cyber risk loss modeling in a general LAN topology, and risk management of cyber risk liabilities for a diversified portfolio of insurance policies written on companies across multiple industries and sectors in the general economy.

TABLE 2. The analytical versus simulated values of the expected size of cyber attack due to breach E_(r)(S) given level r are presented. A deterministic tree (d = 3) having four levels (R = 4) was assumed, that is, [p₀, p₁, p₂, p₃] = [0, 0, 0, 1].

| r | Analytical (p = 0.2) | Simulated (p = 0.2) | Analytical (p = 0.8) | Simulated (p = 0.8) |
|---|---|---|---|---|
| 0 | 2.177552 | 2.176000 | 22.976907 | 22.984000 |
| 1 | 2.315189 | 2.316800 | 21.681858 | 21.684800 |
| 2 | 2.001532 | 1.999360 | 18.561597 | 18.571840 |
| 3 | 1.360507 | 1.359872 | 15.208322 | 15.217472 |

Proof of Theorem 2

The first ingredient to prove Theorem 2 is to observe that, when the cyber attack starts at the root of the random tree, the number of vertices being hacked is related to the number of individuals in a certain branching process. Indeed, let X_(n) be the number of vertices at distance n from the root being hacked. Because the number of edges starting from each vertex and moving away from the root is described by X and because these edges are independently open with probability p,

$\begin{matrix} {X_{n + 1} = {Y_{n,1} + Y_{n,2} + \ldots + Y_{n,X_{n}}}} & (4) \end{matrix}$

where the random variables Y_(n,i) are independent and equal in distribution to the sum of X independent Bernoulli random variables with parameter p. This implies that (X_(n)) is the branching process with offspring distribution Y. In particular, the expected value and variance of the global cost C can be conveniently written using the expected value and variance of the offspring distribution, so we start by computing these two quantities in the following lemma.

Lemma 7—Recalling that μ=E(X) and σ²=Var(X), we have v=E(Y)=μp and Σ²=Var(Y)=p(1−p)μ+p ²σ².

PROOF. Using that the random variable Y is defined as the sum of X independent Bernoulli random variables Y_(i) with parameter p, and conditioning on X, we get

$\begin{matrix} \begin{matrix} {E(Y|X) = E(Y_{1} + \ldots + Y_{X}|X) = X\,E(Y_{i}) = pX} \\ {Var(Y|X) = Var(Y_{1} + \ldots + Y_{X}|X) = X\,Var(Y_{i}) = p(1 - p)X.} \end{matrix} & (5) \end{matrix}$

Taking the expected value in the first line of (5) gives v = E(Y) = E(E(Y|X)) = E(pX) = μp, while the law of total variance together with (5) gives

Σ² = Var(Y) = E(Var(Y|X)) + Var(E(Y|X)) = E(p(1 − p)X) + Var(pX) = p(1 − p)E(X) + p²Var(X) = p(1 − p)μ + p²σ².

This completes the proof. □
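As a numerical sanity check of Lemma 7, the closed-form moments can be compared with the moments of the exact distribution of Y, obtained by mixing binomial distributions over the offspring count X. The helper names `offspring_moments` and `thinned_pmf` are hypothetical and introduced only for this sketch.

```python
from math import comb

def offspring_moments(pmf, p):
    """Return (v, Sigma^2) from Lemma 7 for offspring pmf and edge probability p."""
    mu = sum(k * q for k, q in enumerate(pmf))
    sigma2 = sum(k * k * q for k, q in enumerate(pmf)) - mu ** 2
    return mu * p, p * (1 - p) * mu + p * p * sigma2

def thinned_pmf(pmf, p):
    """Exact pmf of Y, the number of open edges among X offspring edges."""
    y = [0.0] * len(pmf)
    for k, q in enumerate(pmf):
        for j in range(k + 1):
            y[j] += q * comb(k, j) * p ** j * (1 - p) ** (k - j)
    return y

v, Sigma2 = offspring_moments([0.0, 0.1, 0.3, 0.6], 0.2)
y = thinned_pmf([0.0, 0.1, 0.3, 0.6], 0.2)
mean_y = sum(k * q for k, q in enumerate(y))
var_y = sum(k * k * q for k, q in enumerate(y)) - mean_y ** 2
print(v, mean_y, Sigma2, var_y)
```

For the probabilistic tree with mass function [0.0, 0.1, 0.3, 0.6] and p = 0.2, both routes give a mean of 0.5 and a variance of 0.418 up to floating-point error.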

Another key to proving Theorem 2 is the fact that, because the local costs are independent and identically distributed, the expected value and variance of the global cost C can be easily expressed as a function of the expected value and variance of the size S of the attack, i.e., the number of vertices being hacked. Using also that the number of vertices being hacked is related to the branching process (X_(n)), we can prove the following two lemmas.

Lemma 8—Assume that G=

_(R). Then,

${E_{0}(S)} = {\frac{1 - v^{R + 1}}{1 - v}.}$

PROOF. Recalling that X_(n) denotes the number of vertices being hacked due to an attack starting at the root of the random tree G=

_(R), we have X _(n)=card{y∈V:y∈C(0) and d(0,y)=n}

which, together with S=card (C (0)), implies that

$\begin{matrix} {{E_{0}(S)} = {{E\left( {\sum\limits_{n = 0}^{R}\mspace{14mu}{{card}\mspace{14mu}\left\{ {{y \in {V:{y \in {\mathcal{C}(0)\mspace{14mu}{and}\mspace{14mu}{d\left( {0,y} \right)}}}}} = n} \right\}}} \right)} = {{E\left( {\sum\limits_{n = 0}^{R}X_{n}} \right)}.}}} & (6) \end{matrix}$

In addition, taking the expected value in (4), we get

E(X_(n + 1)) = E(E(X_(n + 1)|X_(n))) = E(E(Y_(n, 1) + Y_(n, 2) + …   + Y_(n, Xn)|X_(n))) = E(X_(n)E(Y)) = E(X_(n))E(Y) = vE(X_(n))

so a simple induction gives

$\begin{matrix} {E(X_{n}) = vE(X_{n - 1}) = v^{2}E(X_{n - 2}) = \ldots = v^{n}E(X_{0}) = v^{n}.} & (7) \end{matrix}$

Combining (6) and (7), we conclude that

${E_{0}(S)} = {{E\left( {\sum\limits_{n = 0}^{R}X_{n}} \right)} = {{\sum\limits_{n = 0}^{R}{E\left( X_{n} \right)}} = {{\sum\limits_{n = 0}^{R}v^{n}} = {\frac{1 - v^{R + 1}}{1 - v}.}}}}$

This completes the proof. □

Lemma 9—Assume that G=

_(R). Then,

${{Var}_{0}(S)} = {{\Sigma^{2}\left( {{\sum\limits_{n = 0}^{R}\left( {v^{n - 1}{\sum\limits_{k = 0}^{n - 1}v^{k}}} \right)} + {2{\sum\limits_{m = 1}^{R}{\sum\limits_{n = 0}^{m - 1}\left( {v^{m - 1}{\sum\limits_{k = 0}^{n - 1}v^{k}}} \right)}}}} \right)}.}$

PROOF. Using as in the proof of Lemma 8 that the size S coincides with the sum of the X_(n) when the attack starts at the root, we obtain

$\begin{matrix} {{{Var}_{0}(S)} = {{{Var}\left( {\sum\limits_{n = 0}^{R}X_{n}} \right)} = {{\sum\limits_{n = 0}^{R}{{Var}\left( X_{n} \right)}} + {2{\sum\limits_{m = 1}^{R}{\sum\limits_{n = 0}^{m - 1}{{{cov}\left( {X_{n},X_{m}} \right)}.}}}}}}} & (8) \end{matrix}$

Now, using independence, we get

$\begin{matrix} {E(X_{n}|X_{n - 1}) = E(Y_{n - 1,1} + \ldots + Y_{n - 1,X_{n - 1}}|X_{n - 1}) = X_{n - 1}E(Y) = vX_{n - 1}} \\ {Var(X_{n}|X_{n - 1}) = Var(Y_{n - 1,1} + \ldots + Y_{n - 1,X_{n - 1}}|X_{n - 1}) = X_{n - 1}Var(Y) = \Sigma^{2}X_{n - 1}} \end{matrix}$

which, together with the law of total variance, gives

Var(X_(n)) = E(Var(X_(n)|X_(n − 1))) + Var(E(X_(n)|X_(n − 1))) = E(Σ²X_(n − 1)) + Var(vX_(n − 1)) = Σ²v^(n − 1) + v²Var(X_(n − 1)).

Using a simple induction, we deduce that

$\begin{matrix} {{{Var}\left( X_{n} \right)} = {{{\Sigma^{2}v^{n - 1}} + {v^{2}\left( {{\Sigma^{2}v^{n - 2}} + {v^{2}{{Var}\left( X_{n - 2} \right)}}} \right)}} = {{{\Sigma^{2}v^{n - 1}} + {\Sigma^{2}v^{n}} + {v^{4}{{Var}\left( X_{n - 2} \right)}}} = {{{\Sigma^{2}v^{n - 1}} + {\Sigma^{2}v^{n}} + \ldots\mspace{14mu} + {\Sigma^{2}v^{{2n} - 2}} + {v^{2n}{{Var}\left( X_{0} \right)}}} = {\Sigma^{2}v^{n - 1}{\sum\limits_{k = 0}^{n - 1}{v^{k}.}}}}}}} & (9) \end{matrix}$

In addition, E(X_(m)|X_(n))=v^(m−n)X_(n) for all n≤m therefore

cov(X_(n), X_(m)) = E(X_(n)X_(m)) − E(X_(n))E(X_(m)) = E(X_(n)E(X_(m)|X_(n))) − v^(n)v^(m) = v^(m − n)E(X_(n)²) − v^(n + m) = v^(m − n)Var(X_(n)) + v^(m − n)E(X_(n))² − v^(n + m) = v^(m − n)Var(X_(n)) + v^(m − n)v^(2n) − v^(n + m) = v^(m − n)Var(X_(n))

which, together with (9), gives

$\begin{matrix} {{cov}\left( {X_{n},X_{m}} \right) = \Sigma^{2}v^{m - 1}\sum\limits_{k = 0}^{n - 1}v^{k}\quad\text{for all } n \leq m.} & (10) \end{matrix}$

Combining (8)-(10), we conclude that

${{Var}_{0}(S)} = {\Sigma^{2}\left( {{\sum\limits_{n = 0}^{R}\left( {v^{n - 1}{\sum\limits_{k = 0}^{n - 1}v^{k}}} \right)} + {2{\sum\limits_{m = 1}^{R}{\sum\limits_{n = 0}^{m - 1}\left( {v^{m - 1}{\sum\limits_{k = 0}^{n - 1}v^{k}}} \right)}}}} \right)}$

This completes the proof.

Computing the sums in the previous lemma explicitly, we obtain the following result whose proof is purely computational and done in the appendix.

Lemma 10—Assume that G=

_(R). Then,

${{Var}_{0}(S)} = {\frac{\Sigma^{2}}{\left( {1 - v} \right)^{2}}{\left( {\frac{1 - v^{{2R} + 1}}{1 - v} - {\left( {{2R} + 1} \right)v^{R}}} \right).}}$
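The equivalence between the double sums of Lemma 9 and the closed form of Lemma 10 can be checked numerically. The sketch below uses hypothetical function names and assumes v ≠ 1; the n = 0 term of the first sum is skipped because its inner sum is empty.

```python
def var_size_double_sum(v, Sigma2, R):
    """Var_0(S) from the double sums of Lemma 9."""
    def geo(n):
        return sum(v ** k for k in range(n))        # sum_{k=0}^{n-1} v^k
    diag = sum(v ** (n - 1) * geo(n) for n in range(1, R + 1))
    cross = sum(v ** (m - 1) * geo(n)
                for m in range(1, R + 1) for n in range(m))
    return Sigma2 * (diag + 2 * cross)

def var_size_closed(v, Sigma2, R):
    """Var_0(S) from the closed form of Lemma 10 (requires v != 1)."""
    return Sigma2 / (1 - v) ** 2 * (
        (1 - v ** (2 * R + 1)) / (1 - v) - (2 * R + 1) * v ** R)

for R in (1, 2, 3, 4):
    print(R, var_size_double_sum(0.5, 1.0, R), var_size_closed(0.5, 1.0, R))
```

For R = 2 both expressions reduce to Σ²(1 + 3v + v²), which is a convenient case to verify by hand.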

Given each realization of the random graph G=

_(R), the local costs (C_(y)) are independent and identically distributed. In addition,

$S = {{{card}\mspace{11mu}\left( {\mathcal{C}(0)} \right)\mspace{14mu}{and}\mspace{14mu} C} = {\sum\limits_{y \in {\mathcal{C}(0)}}C_{y}}}$

when the attack starts at the root, therefore

$\begin{matrix} {E_{0}(C|S) = S\,E(C_{0})\mspace{14mu}{and}\mspace{14mu}{Var}_{0}(C|S) = S\,{Var}(C_{0}).} & (11) \end{matrix}$

Taking the expected value in the first equation of (11), we get E ₀(C)=E(E ₀(C|S))=E ₀(SE(C ₀))=E ₀(S)E(C ₀)   (12)

while using the law of total variance and (11) gives

$\begin{matrix} \begin{matrix} {{{Var}_{0}(C)} = {{E\left( {{Var}_{0}\left( C \middle| S \right)} \right)} + {{Var}\left( {E_{0}\left( C \middle| S \right)} \right)}}} \\ {= {{E_{0}\left( {S\;{{Var}\left( C_{0} \right)}} \right)} + {{Var}_{0}\left( {{SE}\left( C_{0} \right)} \right)}}} \\ {= {{{E_{0}(S)}{{Var}\left( C_{0} \right)}} + {{{Var}_{0}(S)}{\left( {E\left( C_{0} \right)} \right)^{2}.}}}} \end{matrix} & (13) \end{matrix}$

In addition, combining Lemmas 7 and 8, we get

$\begin{matrix} {{E_{0}(S)} = {\frac{1 - v^{R + 1}}{1 - v} = \frac{1 - \left( {\mu\; p} \right)^{R + 1}}{1 - {\mu\; p}}}} & (14) \end{matrix}$

while combining Lemmas 7 and 10, we get

$\begin{matrix} \begin{matrix} {{{Var}_{0}(S)} = {\frac{\Sigma^{2}}{\left( {1 - v} \right)^{2}}\left( {\frac{1 - v^{{2R} + 1}}{1 - v} - {\left( {{2R} + 1} \right)v^{R}}} \right)}} \\ {= \frac{{{p\left( {1 - p} \right)}\mu} + {p^{2}\sigma^{2}}}{\left( {1 - {\mu\; p}} \right)^{2}}} \\ {\left( {\frac{1 - \left( {\mu\; p} \right)^{{2R} + 1}}{1 - {\mu\; p}} - {\left( {{2R} + 1} \right)\left( {\mu\; p} \right)^{R}}} \right).} \end{matrix} & (15) \end{matrix}$

The theorem follows from (12)-(15).
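Equations (12) through (15) combine into a short computation of the mean and variance of the global cost when the attack starts at the root. The following sketch is illustrative only: the function name and argument names are hypothetical, and it assumes μp ≠ 1 so that the closed forms are well defined.

```python
def cost_moments_at_root(mu, sigma2, p, R, mean_cost, var_cost):
    """Return (E_0(C), Var_0(C)) using equations (12)-(15); requires mu * p != 1."""
    v = mu * p                                       # Lemma 7
    Sigma2 = p * (1 - p) * mu + p * p * sigma2       # Lemma 7
    e_size = (1 - v ** (R + 1)) / (1 - v)            # equation (14)
    v_size = Sigma2 / (1 - v) ** 2 * (
        (1 - v ** (2 * R + 1)) / (1 - v) - (2 * R + 1) * v ** R)   # equation (15)
    e_cost = e_size * mean_cost                      # equation (12)
    v_cost = e_size * var_cost + v_size * mean_cost ** 2           # equation (13)
    return e_cost, v_cost

# Illustrative inputs: ternary deterministic tree, p = 0.2, radius 3, cost fixed at 1000.
print(cost_moments_at_root(mu=3.0, sigma2=0.0, p=0.2, R=3,
                           mean_cost=1000.0, var_cost=0.0))
```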

Proof of Theorem 3

The proof follows from well-known results for branching processes. Recalling that X_(n) denotes the number of hacked vertices at distance n from the root, for all 0≤n<R,

{R₀=n}={X_(n)≠0}∩{X_(n+1)=0} and {R₀=R}={X_(R)≠0}.

Recalling also from (7) that E(X_(n))=(μp)^(n), we deduce that

${P_{0}\left( {R_{0} \geq n} \right)} = {{P\left( {X_{n} \neq 0} \right)} = {{{\sum\limits_{k = 1}^{\infty}{P\left( {X_{n} = k} \right)}} \leq {\sum\limits_{k = 0}^{\infty}{{kP}\left( {X_{n} = k} \right)}}} = {{E\left( X_{n} \right)} = \left( {\mu\; p} \right)^{n}}}}$

which proves exponential decay of the radius of the cyber attack when μp<1. To deal with the supercritical case μp>1, we simply observe that

P₀(R₀=R)≥P(X_(n)≠0 for all n>0)

and that the right-hand side is the survival probability of the branching process which is known to be equal to 1−θ_(∞) where θ_(∞) is the unique fixed point in the open interval (0, 1) of the probability generating function ϕ. We refer the reader to [21, Theorem 6.2] for a proof.
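The fixed point θ_(∞) can be approximated numerically. The sketch below assumes that ϕ is the probability generating function of the offspring variable Y, so that ϕ(s) = E[(1 − p + ps)^X], and it iterates ϕ starting from 0, which converges to the smallest fixed point in [0, 1]. The function name and the fixed iteration count are illustrative choices, and convergence is slow when μp is close to 1.

```python
def extinction_probability(pmf, p, iterations=500):
    """Smallest fixed point in [0, 1] of the pgf of Y, by iterating from 0."""
    def pgf(s):
        # Y given X = k is binomial(k, p), whose pgf is (1 - p + p*s)**k
        return sum(q * (1 - p + p * s) ** k for k, q in enumerate(pmf))
    theta = 0.0
    for _ in range(iterations):
        theta = pgf(theta)
    return theta

theta = extinction_probability([0.0, 0.0, 1.0], 0.8)   # binary tree, p = 0.8
print(theta, 1 - theta)   # extinction and survival probabilities
```

For a deterministic binary tree with p = 0.8, the fixed-point equation (0.2 + 0.8s)² = s has roots 0.0625 and 1, so the survival probability is 0.9375.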

Proof of Theorem 4*

The proof of Theorem 4 is rather long so, before going into the details, we briefly explain the strategy and the different steps of the proof. The starting point is to observe that, because of the underlying tree structure, there is exactly one path connecting any two vertices. This implies that the probability that a given vertex y is hacked is simply p raised to the power of the graph distance between the origin of the attack and vertex y. In particular, to compute the expected size of the cyber attack, it suffices to count the number of self-avoiding paths of a given length n starting from a given vertex. The approach to count the number of paths starting from a vertex x at distance r from the root differs depending on r. Accordingly, we distinguish between two cases and find different expressions for the expected size:

${E_{r}(S)} = \left\{ \begin{matrix} {\phi_{-}(r)} & {{{when}\mspace{14mu} 0} < {2r} < R} \\ {\phi_{+}(r)} & {{{when}\mspace{14mu} 2r} > R > {r.}} \end{matrix} \right.$

See FIG. 3 for a picture of these two cases. Then, we cover the boundary cases, and show that the two expressions above extend to these particular cases: E₀(S)=ϕ₋(0), E_(R/2)(S)=ϕ₋(R/2), E_(R)(S)=ϕ₊(R).

The last step to complete the proof of the theorem is purely computational and consists in proving that the two expressions above in fact coincide and that, for all 0≤r≤R,

${\phi_{-}(r)} = {{\phi_{+}(r)} = {\frac{1}{1 - {\mu\; p}}\left( {1 + \frac{{p\left( {1 - p^{r}} \right)}\left( {1 - {\left( {\mu - \bar{\mu}} \right)p}} \right)}{1 - p} - \frac{\left( {\mu\; p} \right)^{R - r + 1}\left( {1 - {p^{2}\left( {\mu - {\bar{\mu}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}} \right)}} \right)}{1 - {\mu\; p^{2}}}} \right).}}$
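The closed-form expression can be cross-checked against a direct summation over the tree. In the sketch below, `expected_size_closed` encodes one reading of the formula above, and `expected_size_direct` adds up, for a vertex at distance r from the root, its own subtree, each ancestor, and the other subtrees hanging from each ancestor. Both function names are hypothetical, the parameter `mu_bar` stands for the conditional mean of Lemma 12, and the values R = 3, μ = 3, μ̄ = 2 are illustrative choices only.

```python
def expected_size_closed(r, R, mu, mu_bar, p):
    """Closed form for E_r(S); requires p != 1, mu*p != 1 and mu*p**2 != 1."""
    a = p * (1 - p ** r) * (1 - (mu - mu_bar) * p) / (1 - p)
    b = ((mu * p) ** (R - r + 1)
         * (1 - p * p * (mu - mu_bar * (1 - (mu * p * p) ** r)))
         / (1 - mu * p * p))
    return (1 + a - b) / (1 - mu * p)

def expected_size_direct(r, R, mu, mu_bar, p):
    """Same quantity by summing p**distance over expected vertex counts."""
    def below(h):
        # expected hacked vertices in a subtree of remaining depth h, root included
        return sum((mu * p) ** n for n in range(h + 1))
    total = below(R - r)
    for k in range(1, r + 1):
        total += p ** k                                        # ancestor k steps up
        total += mu_bar * p ** (k + 1) * below(R - r + k - 1)  # its other subtrees
    return total

for r in range(4):
    print(r, expected_size_closed(r, 3, 3.0, 2.0, 0.2),
          expected_size_direct(r, 3, 3.0, 2.0, 0.2))
```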

Connection with the number of self-avoiding paths.

As previously, because the local costs are independent and identically distributed, there is a simple relationship between the expected cost C and the expected size S. In particular, equation (12) above again holds in the general case where the attack starts at a vertex at distance r>0 from the root, so the first step is to compute the expected size. However, because the attack does not start at the root, the connection with branching process can no longer be used. Instead, the main ingredient is to evaluate the following expected number of vertices/self-avoiding paths Γ_(r)(n)=E(card {y∈V:d(x,y)=n})   (16)

where it is assumed that d(0, x)=r and the attack starts at x.

Lemma 11—Assume that G=

_(R). Then,

${E_{r}(S)} = {\sum\limits_{n = 0}^{\infty}{{\Gamma_{r}(n)}{p^{n}.}}}$

PROOF. The probability of vertex y being hacked is equal to the probability that there is an open path connecting y to the vertex x where the attack starts. Because there is exactly one self-avoiding path connecting the two vertices due to the lack of cycles, and because the edges are independently open with probability p, the probability of y being hacked is simply p raised to the power of the length of this path, which is also the distance d(x, y) between the two vertices. In addition, letting ζ(y)=1{vertex y is hacked} for all y∈V,

the number of vertices being hacked can be written

$S = {{{card}\left( {\mathcal{C}(x)} \right)} = {{\sum\limits_{y \in V}{\zeta(y)}} = {\sum\limits_{n = 0}^{\infty}{\sum\limits_{y \in V}{{\zeta(y)}1{\left\{ {{d\left( {x,y} \right)} = n} \right\}.}}}}}}$

Using that vertex y is being hacked with probability E(ζ(y))=p^(d(x,y)) and taking the expected value in the previous equation, we deduce that

${E_{r}(S)} = {{\sum\limits_{n = 0}^{\infty}{{E\left( {{card}\left\{ {{y \in {V\text{:}\mspace{11mu}{d\left( {x,y} \right)}}} = n} \right\}} \right)}p^{n}}} = {\sum\limits_{n = 0}^{\infty}{{\Gamma_{r}(n)}{p^{n}.}}}}$

This completes the proof. □

The next step is to compute the expected number of vertices in (16). To do so, we need the following lemma about the conditional expected number of edges starting from a vertex.

Lemma 12—We have E(X−1|X≠0)=μ̄.

PROOF. By conditioning on whether X≠0 or not, we get

$\begin{matrix} {{E(X)} = {{{E\left( X \middle| {X \neq 0} \right)}{P\left( {X \neq 0} \right)}} + {{E\left( X \middle| {X = 0} \right)}{P\left( {X = 0} \right)}}}} \\ {= {{E\left( X \middle| {X \neq 0} \right)}{P\left( {X \neq 0} \right)}}} \end{matrix}$

from which we deduce that

${E\left( {X - 1} \middle| {X \neq 0} \right)} = {{\frac{E(X)}{P\left( {X \neq 0} \right)} - 1} = {{\frac{\mu}{1 - p_{0}} - 1} = {\overset{\_}{\mu}.}}}$

This completes the proof.

Counting the expected number of vertices relies on basic combinatorics and the following three properties that hold for every vertex x≠0.

There is exactly one edge starting from x moving toward the root.

The expected number of edges starting from x moving away from the root is μ.

Given that there is at least one edge starting from x moving away from the root, the expected number of additional such edges is μ̄.

The approach to compute the number of vertices in (16) depends on the value of r and we now distinguish two cases: 0<2r<R and 2r>R>r.

Expected size when 0<2r<R.

We compute the expected number of vertices in (16) by distinguishing among what we shall call short, intermediate, and long paths. This distinction is motivated by the following:

Short paths starting from x with length n≤r can at most reach the root.

Intermediate paths starting from x with length r<n≤R−r can at most reach one vertex at distance R from the root of the random tree, if any.

Long paths starting from x with length n>R−r can reach multiple vertices at distance R from the root of the random tree, if any.

For short paths, we have the following estimate.

Lemma 13—Assume n≤r and G=

_(R). Then,

${\Gamma_{r}(n)} = {1 + {\overset{\_}{\mu}\left( \frac{1 - \mu^{n - 1}}{1 - \mu} \right)} + {\mu^{n}.}}$

PROOF. The idea is to count the expected number of self-avoiding paths with length n starting at the origin x of the attack. Due to the absence of cycles, when following a path starting from x, once we move away from the root, we can no longer move closer to the root without visiting a vertex visited previously. In particular, following a self-avoiding path consists in moving n−k times in a row toward the root and then k times in a row away from the root. Therefore, letting

Γ_(r,k)(n)=expected number of paths moving k times away from the root,

and using the three properties above, we get

Γ_(r,k)(n)=1×…×1×μ̄×μ×…×μ=μ̄μ^(k−1) for all 1≤k<n  (17)

where μ̄ is the expected number of possible edges one can follow the first time moving away from the root of the random tree. Note that, when k=0, we move n times in a row toward the root and there is only one possible edge to follow at each step, therefore

Γ_(r,k)(n)=1×…×1=1 for k=0.  (18)

Finally, when k=n, we move n times in a row away from the root and there is an average of μ possible edges to follow at each step, therefore

Γ_(r,k)(n)=μ×…×μ=μ^(n) for k=n.  (19)

Adding (17)-(19), we conclude that, for n≤r,

${\Gamma_{r}(n)} = {{\sum\limits_{k = 0}^{n}\;{\Gamma_{r,k}(n)}} = {{1 + {\sum\limits_{k = 0}^{n - 2}{\overset{\_}{\mu}\mu^{k}}} + \mu^{n}} = {1 + {\overset{\_}{\mu}\left( \frac{1 - \mu^{n - 1}}{1 - \mu} \right)} + {\mu^{n}.}}}}$

This completes the proof. □

For intermediate paths, we have the following estimate.

Lemma 14—Assume r<n≤R−r and G=

_(R). Then,

${\Gamma_{r}(n)} = {{\overset{\_}{\mu}{\mu^{n - r - 1}\left( \frac{1 - \mu^{r}}{1 - \mu} \right)}} + {\mu^{n}.}}$

PROOF. The idea is again to count the number of self-avoiding paths starting from x. For paths with length r<n≤R−r, after r steps, we can only move away from the root. To count the number of such paths, we divide the path into three pieces by moving toward the root r−k times, then away from the root k times and away from the root another n−r times. After the first two steps, we moved r times therefore Γ_(r,n−r+k)(n)=μ^(n−r)Γ_(r,k)(r) for all 1≤k<r.

This, together with (17), implies that

Γ_(r,n−r+k)(n)=μ^(n−r)μ̄μ^(k−1)=μ̄μ^(n−r+k−1) for all 1≤k<r.  (20)

Note that, when k=0, because n>r, the path must go through the root. The expected number of possible edges to follow right after visiting the root is μ̄ while the expected number of possible edges to follow at each of the subsequent n−r−1 steps is μ, therefore

Γ_(r,n−r+k)(n)=μ̄μ^(n−r−1) for k=0.  (21)

Finally, when k=r, we only move away from the root, so Γ_(r,n−r+k)(n)=μ^(n) for k=r.  (22)

Adding (20)-(22), we deduce that

${\Gamma_{r}(n)} = {{\sum\limits_{k = 0}^{r}\;{\Gamma_{r,{n - r + k}}(n)}} = {{{\sum\limits_{k = 0}^{r - 1}{\overset{\_}{\mu}\mu^{n - r + k - 1}}} + \mu^{n}} = {{\overset{\_}{\mu}{\mu^{n - r - 1}\left( \frac{1 - \mu^{r}}{1 - \mu} \right)}} + {\mu^{n}.}}}}$

and the proof is complete.
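For a deterministic tree the counts Γ_(r)(n) are exact integers, so Lemmas 13 and 14 can be checked by brute force. The sketch below builds a full d-ary tree with heap-style vertex numbering, runs a breadth-first search from a vertex at depth r, and compares the number of vertices at each distance with the two formulas. Lemma 13 is coded with the exponent n − 1, which is what the sum over k from 0 to n − 2 evaluates to. All function names are hypothetical, and d ≥ 2 is assumed.

```python
from collections import deque

def distance_counts(d, R, r):
    """Number of vertices at each distance from a depth-r vertex of the full d-ary tree."""
    n_vertices = (d ** (R + 1) - 1) // (d - 1)   # requires d >= 2
    x = 0
    for _ in range(r):
        x = d * x + 1                            # leftmost vertex at depth r
    dist = {x: 0}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        nbrs = [d * u + i for i in range(1, d + 1) if d * u + i < n_vertices]
        if u > 0:
            nbrs.append((u - 1) // d)            # parent of u
        for w in nbrs:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    counts = {}
    for k in dist.values():
        counts[k] = counts.get(k, 0) + 1
    return counts

def gamma_short(n, mu, mu_bar):                  # Lemma 13, valid for n <= r
    return 1 + mu_bar * (1 - mu ** (n - 1)) / (1 - mu) + mu ** n

def gamma_intermediate(n, r, mu, mu_bar):        # Lemma 14, valid for r < n <= R - r
    return mu_bar * mu ** (n - r - 1) * (1 - mu ** r) / (1 - mu) + mu ** n

def expected_size_from_counts(counts, p):        # Lemma 11
    return sum(c * p ** n for n, c in counts.items())

counts = distance_counts(d=3, R=5, r=2)
print(counts[1], gamma_short(1, 3, 2))
print(counts[2], gamma_short(2, 3, 2))
print(counts[3], gamma_intermediate(3, 2, 3, 2))
```

With d = 3 the parameters are μ = 3 and μ̄ = 2, and the case R = 5, r = 2 satisfies 0 < 2r < R, so distances 1 and 2 fall under Lemma 13 and distance 3 under Lemma 14.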

Finally, when R−r<n≤R+r, there is 1≤l≤r such that (a)n=R−r+2l−1 or (b)n=R−r+2l.

Distinguishing between case (a) and case (b), we have the next lemma.

Lemma 15—Assume G=

_(R). For all 1≤l≤r,

${\Gamma_{r}\left( {R - r + {2l} - 1} \right)} = {{\sum\limits_{k = l}^{r}{\overset{\_}{\mu}\mu^{R - r + {2l} - k - 2}}} = {\overset{\_}{\mu}{\mu^{R - {2r} + {2l} - 2}\left( \frac{1 - \mu^{r - l + 1}}{1 - \mu} \right)}}}$ ${\Gamma_{r}\left( {R - r + {2l}} \right)} = {{\sum\limits_{k = l}^{r}{\overset{\_}{\mu}\mu^{R - r + {2l} - k - 1}}} = {\overset{\_}{\mu}{\mu^{R - {2r} + {2l} - 1}\left( \frac{1 - \mu^{r - l + 1}}{1 - \mu} \right)}}}$

PROOF. In both cases (a) and (b), the first l moves must be directed toward the root for the self-avoiding path to stay within distance R from the root, so the number of moves k toward the root can be any l≤k≤r, while the number of moves away from the root is n−k. In addition, the same argument as in Lemmas 13 and 14 implies that

Γ_(r,R−r+2l−k−1)(R−r+2l−1)=μ̄μ^(R−r+2l−k−2) for all l≤k≤r

Γ_(r,R−r+2l−k)(R−r+2l)=μ̄μ^(R−r+2l−k−1) for all l≤k≤r.

Summing over all possible values of k gives the result. □

Combining Lemmas 11 and 13-15 and observing that Γ_(r)(n)=0 for all n>R+r, we obtain the following implicit expression for the expected value of the size:

${E_{r}(S)} = {1 + {\sum\limits_{n = 1}^{r}{\left( {1 + {\overset{\_}{\mu}\left( \frac{1 - \mu^{n - 1}}{1 - \mu} \right)} + \mu^{n}} \right)p^{n}}} + {\sum\limits_{n = {r + 1}}^{R - r}{\left( {{\overset{\_}{\mu}{\mu^{n - r - 1}\left( \frac{1 - \mu^{r}}{1 - \mu} \right)}} + \mu^{n}} \right)p^{n}}} + {\sum\limits_{l = 1}^{r}{\sum\limits_{k = l}^{r}{\overset{\_}{\mu}\left( {{\mu^{R - r + {2l} - k - 2}p^{R - r + {2l} - 1}} + {\mu^{R - r + {2l} - k - 1}p^{R - r + {2l}}}} \right)}}}}$

which can be reduced to

$\begin{matrix} {{E_{r}(S)} = {{\sum\limits_{n = 1}^{r}{\left( {1 + {\overset{\_}{\mu}\left( \frac{1 - \mu^{n - 1}}{1 - \mu} \right)}} \right)p^{n}}} + {\sum\limits_{n = {r + 1}}^{R - r}{\left( {\overset{\_}{\mu}{\mu^{n - r - 1}\left( \frac{1 - \mu^{r}}{1 - \mu} \right)}} \right)p^{n}}} + {\sum\limits_{n = 0}^{R - r}\left( {\mu\; p} \right)^{n}} + {\sum\limits_{l = 1}^{r}{\sum\limits_{k = l}^{r}{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}\mu^{R - r + {2l} - k - 2}{p^{R - r + {2l} - 1}.}}}}}} & (23) \end{matrix}$

Computing each sum in (23) gives the following lemma.

Lemma 16—For all 0≤r≤R, the right-hand side of (23) is

${\phi_{-}(r)} = {{\left( \frac{1 - \left( {\mu - \overset{\_}{\mu}} \right)}{1 - \mu} \right)\left( \frac{p\left( {1 - p^{r}} \right)}{1 - p} \right)} - {\left( \frac{\overset{\_}{\mu}p}{1 - \mu} \right)\left( \frac{1 - \left( {\mu\; p} \right)^{r}}{1 - {\mu\; p}} \right)} + {\overset{\_}{\mu}{p^{r + 1}\left( \frac{1 - \mu^{r}}{1 - \mu} \right)}\left( \frac{1 - \left( {\mu\; p} \right)^{R - {2r}}}{1 - {\mu\; p}} \right)} + \frac{1 - \left( {\mu\; p} \right)^{R - r + 1}}{1 - {\mu\; p}} + {\left( \frac{\overset{\_}{\mu}\; p^{R - r + 1}}{1 - {\mu\; p}} \right){\left( {\frac{\mu^{R - {2r}}\left( {1 - \mu^{r}} \right)}{1 - \mu} - \frac{\mu^{R - r + 1}{p^{2}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}}{1 - {\mu\; p^{2}}}} \right).}}}$

The proof of Lemma 16 is mostly computational and can be found in the appendix.

Expected size when 2r>R>r and R is even.

In this case, the arguments in the proof of Lemma 13 are still valid for self-avoiding paths of length n≤R−r rather than n≤r so we have the following estimate:

$\begin{matrix} {{\Gamma_{r}(n)} = {{1 + {\overset{\_}{\mu}\left( \frac{1 - \mu^{n - 1}}{1 - \mu} \right)} + {\mu^{n}\mspace{14mu}{for}\mspace{14mu}{all}\mspace{14mu} n}} \leq {R - {r.}}}} & (24) \end{matrix}$

Now, when R−r<n≤R+r, there is 1≤l≤r such that (a)n=R−r+2l−1 or (b)n=R−r+2l.

The proof of Lemma 15 easily extends to self-avoiding paths of length n>r, which corresponds to the condition l>r−R/2 when R is even. In particular,

$\begin{matrix} {{{\Gamma_{r}\left( {R - r + {2l} - 1} \right)} = {{\sum\limits_{k = l}^{r}{\overset{\_}{\mu}\mu^{R - r + {2l} - k - 2}}} = {\overset{\_}{\mu}{\mu^{R - {2r} + {2l} - 2}\left( \frac{1 - \mu^{r - l + 1}}{1 - \mu} \right)}}}}\mspace{79mu}{{\Gamma_{r}\left( {R - r + {2l}} \right)} = {{\sum\limits_{k = l}^{r}{\overset{\_}{\mu}\mu^{R - r + {2l} - k - 1}}} = {\overset{\_}{\mu}{\mu^{R - {2r} + {2l} - 1}\left( \frac{1 - \mu^{r - l + 1}}{1 - \mu} \right)}}}}} & (25) \end{matrix}$

for all r−R/2<l≤r. For intermediate self-avoiding paths of length R−r<n≤r, things are a little bit different. Following the proof of Lemma 14, the first l moves must be directed toward the root for the self-avoiding path to stay within distance R from the root. However, we now have n≤r so the number of moves k toward the root can be any l≤k≤n. In particular, we now have self-avoiding paths that only move toward the root (k=n). More precisely,

${\Gamma_{r,{R - r + {2l} - k - 1}}\left( {R - r + {2l} - 1} \right)} = \left\{ \begin{matrix} {\overset{\_}{\mu}\mu^{R - r + {2l} - k - 2}} & {{{for}\mspace{14mu} l} \leq k < {R - r + {2l} - 1}} \\ 1 & {{{for}\mspace{14mu} k} = {R - r + {2l} - 1.}} \end{matrix} \right.$

Similarly, we have

${\Gamma_{r,{R - r + {2l} - k}}\left( {R - r + {2l}} \right)} = \left\{ \begin{matrix} {\overset{\_}{\mu}\mu^{R - r + {2l} - k - 1}} & {{{for}\mspace{14mu} l} \leq k < {R - r + {2l}}} \\ 1 & {{{for}\mspace{14mu} k} = {R - r + {2{l.}}}} \end{matrix} \right.$

Summing over all possible values of k gives

$\begin{matrix} {{{\Gamma_{r}\left( {R - r + {2l} - 1} \right)} = {{1 + {\sum\limits_{k = l}^{R - r + {2l} - 2}{\overset{\_}{\mu}\mu^{R - r + {2l} - k - 2}}}} = {1 + {\overset{\_}{\mu}\left( \frac{1 - \mu^{R - r + {2l} - 1}}{1 - \mu} \right)}}}}{{\Gamma_{r}\left( {R - r + {2l}} \right)} = {{1 + {\sum\limits_{k = l}^{R - r + {2l} - 1}{\overset{\_}{\mu}\mu^{R - r + {2l} - k - 1}}}} = {1 + {\overset{\_}{\mu}\left( \frac{1 - \mu^{R - r + {2l}}}{1 - \mu} \right)}}}}} & (26) \end{matrix}$

for all 1≤l≤r−R/2. Combining Lemma 11 and (24)-(26), we obtain

${E_{r}(S)} = {1 + {\sum\limits_{n = 1}^{R - r}{\left( {1 + {\overset{\_}{\mu}\left( \frac{1 - \mu^{n - 1}}{1 - \mu} \right)} + \mu^{n}} \right)p^{n}}} + {\sum\limits_{l = 1}^{r - {R/2}}\left( {p^{R - r + {2l} - 1} + {\left( {1 + \overset{\_}{\mu}} \right)p^{R - r + {2l}}}} \right)} + {\sum\limits_{l = 1}^{r - {R/2}}{\sum\limits_{k = l}^{R - r + {2l} - 2}{\overset{\_}{\mu}\left( {{\mu^{R - r + {2l} - k - 2}p^{R - r + {2l} - 1}} + {\mu^{R - r + {2l} - k - 1}p^{R - r + {2l}}}} \right)}}} + {\sum\limits_{l = {r - {R/2} + 1}}^{r}{\sum\limits_{k = l}^{r}{\overset{\_}{\mu}\left( {{\mu^{R - r + {2l} - k - 2}p^{R - r + {2l} - 1}} + {\mu^{R - r + {2l} - k - 1}p^{R - r + {2l}}}} \right)}}}}$

which can be reduced to

$\begin{matrix} {{E_{r}(S)} = {{\sum\limits_{n = 1}^{R - r}{\left( {1 + {\overset{\_}{\mu}\left( \frac{1 - \mu^{n - 1}}{1 - \mu} \right)}} \right)p^{n}}} + {\sum\limits_{n = 0}^{R - r}\left( {\mu\; p} \right)^{n}} + {\sum\limits_{l = 1}^{r - {R/2}}{\left( {1 + {\left( {1 + \overset{\_}{\mu}} \right)p}} \right)p^{R - r + {2l} - 1}}} + {\sum\limits_{l = 1}^{r - {R/2}}{\sum\limits_{k = l}^{R - r + {2l} - 2}{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}\mu^{R - r + {2l} - k - 2}p^{R - r + {2l} - 1}}}} + {\sum\limits_{l = {r - {R/2} + 1}}^{r}{\sum\limits_{k = l}^{r}\;{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}\mu^{R - r + {2l} - k - 2}p^{R - r + {2l} - 1}}}}}} & (27) \end{matrix}$

Computing each sum in (27) gives the following lemma.

Lemma 17—For all 0≤r≤R, the right-hand side of (27) is

${\phi_{+}(r)} = {{\left( \frac{1 - \left( {\mu - \overset{\_}{\mu}} \right)}{1 - \mu} \right)\left( \frac{p\left( {1 - p^{R - r}} \right)}{1 - p} \right)} - {\left( \frac{\overset{\_}{\mu}p}{1 - \mu} \right)\left( \frac{1 - \left( {\mu\; p} \right)^{R - r}}{1 - {\mu\; p}} \right)} + \frac{1 - \left( {\mu\; p} \right)^{R - r + 1}}{1 - {\mu\; p}} + {\left( {1 + {\left( {1 + \overset{\_}{\mu}} \right)p}} \right){p^{R - r + 1}\left( \frac{1 - p^{{2r} - R}}{1 - \; p^{2}} \right)}} + {\left( \frac{\overset{\_}{\mu}\; p^{R - r + 1}}{1 - {\mu\; p}} \right)\left( {\frac{\mu^{R - {2r}}\left( {1 - \mu^{r}} \right)}{1 - \mu} - \frac{\mu^{R - r + 1}{p^{2}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}}{1 - {\mu\; p^{2}}}} \right)} + {\left( \frac{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}p^{R - r + 1}}{1 - \mu} \right){\left( {\left( \frac{1 - p^{{2r} - R}}{1 - p^{2}} \right) - {\mu^{R - {2r}}\left( \frac{1 - \left( {\mu\; p} \right)^{{2r} - R}}{1 - \left( {\mu\; p} \right)^{2}} \right)}} \right).}}}$

The proof of Lemma 17 is mostly computational and can be found in the appendix.

Expected size when r∈{0, R/2, R} and R is even.

As explained at the beginning of this section, we now show that the two expressions ϕ_(±)(r) found respectively in Lemmas 16 and 17 extend to the boundary cases. According to Theorem 2,

$\begin{matrix} {{E_{0}(S)} = {\frac{1 - \left( {\mu\; p} \right)^{R + 1}}{1 - {\mu\; p}}.}} & (28) \end{matrix}$

Even though the proofs of Theorem 2 and (23) differ substantially, taking r=0 in the expression for the expected size in Lemma 16 gives (28); therefore E₀(S)=ϕ₋(0).

When the distance r=R/2, Lemmas 13 and 15 still hold, but because r=R−r, there are no paths of intermediate length. In particular, the second sum in (23) equals zero, so the expected size of the cyber attack when the distance from the root is R/2 reduces to

$\begin{matrix} {{E_{R/2}(S)} = {{\sum\limits_{n = 1}^{R/2}{\left( {1 + {\overset{\_}{\mu}\left( \frac{1 - \mu^{n - 1}}{1 - \mu} \right)}} \right)p^{n}}} + {\sum\limits_{n = 0}^{R/2}\left( {\mu\; p} \right)^{n}} + {\sum\limits_{l = 1}^{R/2}{\sum\limits_{k = l}^{R/2}{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}\mu^{{R/2} + {2l} - k - 2}{p^{{R/2} + {2l} - 1}.}}}}}} & (29) \end{matrix}$

Computing each sum in (29), we find that the expression in Lemma 16 is still valid for r=R/2. In other words, we have the following lemma whose proof is in the appendix.

Lemma 18—The right-hand side of (29) is equal to ϕ₋(R/2).

Finally, when r=R, the condition n≤R−r in (24) is equivalent to n=0 so the equation is not valid because there is obviously only one path of length zero starting from a given vertex. In particular, the expected size given that r=R reduces to

$\begin{matrix} {E_{R}(S) = 1 + \sum\limits_{l = 1}^{R/2}\left( 1 + \left( 1 + \overset{\_}{\mu} \right)p \right)p^{2l - 1} + \sum\limits_{l = 1}^{R/2}\sum\limits_{k = l}^{2l - 2}\overset{\_}{\mu}\left( 1 + \mu\; p \right)\mu^{2l - k - 2}p^{2l - 1} + \sum\limits_{l = R/2 + 1}^{R}\sum\limits_{k = l}^{R}\overset{\_}{\mu}\left( 1 + \mu\; p \right)\mu^{2l - k - 2}p^{2l - 1}.} & (30) \end{matrix}$

As previously, computing each sum in (30), we find that the expression in Lemma 17 is still valid when the distance is r=R. In particular, we have the next lemma proved in the appendix.

Lemma 19—The right-hand side of (30) is equal to ϕ₊(R).

Combining (28) and Lemmas 18 and 19, we conclude that

$\begin{matrix} {{E_{r}(S)} = \left\{ {\begin{matrix} {\phi_{-}(r)} & {when} & {0 \leq {2r} \leq R} \\ {\phi_{+}(r)} & {when} & {{2r} > R \geq r} \end{matrix}.} \right.} & (31) \end{matrix}$

Simplifying ϕ₋(r) and ϕ₊(r).

In view of (31), to complete the proof, the last step is to show that ϕ₋(r) and ϕ₊(r) are both equal to the expression in the statement of the theorem. More precisely, we have the following lemma, whose proof is purely computational and is given in the appendix.

Lemma 20—For all 0≤r≤R,

${\phi_{-}(r)} = {{\phi_{+}(r)} = {\frac{1}{1 - {\mu\; p}}{\left( {1 + \frac{{p\left( {1 - p^{r}} \right)}\left( {1 - {\left( {\mu - \overset{\_}{\mu}} \right)p}} \right)}{1 - p} - \frac{\left( {\mu\; p} \right)^{R - r + 1}\left( {1 - {p^{2}\left( {\mu - {\overset{\_}{\mu}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}} \right)}} \right)}{1 - {\mu\; p^{2}}}} \right).}}}$

Combining Lemma 20 and (31) gives

${E_{r}(S)} = {{\frac{1}{1 - {\mu\; p}}\left( {1 + \frac{{p\left( {1 - p^{r}} \right)}\left( {1 - {\left( {\mu - \overset{\_}{\mu}} \right)p}} \right)}{1 - p} - \frac{\left( {\mu\; p} \right)^{R - r + 1}\left( {1 - {p^{2}\left( {\mu - {\overset{\_}{\mu}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}} \right)}} \right)}{1 - {\mu\; p^{2}}}} \right)\mspace{14mu}{for}\mspace{14mu}{all}\mspace{14mu} 0} \leq r \leq {R.}}$

In particular, Theorem 4 follows from the fact that E_(r)(C)=E_(r)(S)E(C₀).
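The closed form above can be evaluated directly. The following is a minimal sketch and not part of the original derivation: the function name `expected_size` and its argument names are assumptions made for illustration, and exact rational arithmetic is used only to avoid rounding. It assumes p≠1, μp≠1 and μp²≠1 so that no denominator vanishes.

```python
from fractions import Fraction


def expected_size(r, R, mu, mu_bar, p):
    """Closed form for E_r(S) as displayed above (hypothetical helper).

    r      : distance between the root and the starting point of the attack
    R      : radius of the tree
    mu     : the parameter written mu in the text
    mu_bar : the parameter written mu with an overbar in the text
    p      : probability that an edge is open
    """
    head = p * (1 - p ** r) * (1 - (mu - mu_bar) * p) / (1 - p)
    tail = ((mu * p) ** (R - r + 1)
            * (1 - p ** 2 * (mu - mu_bar * (1 - (mu * p ** 2) ** r)))
            / (1 - mu * p ** 2))
    return (1 + head - tail) / (1 - mu * p)


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the text.
    mu, mu_bar, p = Fraction(2), Fraction(3, 2), Fraction(1, 3)
    for r in range(0, 5):
        print(r, expected_size(r, 4, mu, mu_bar, p))
```

Setting r=0 in this function recovers the expression (28) for E₀(S), and multiplying the result by E(C₀) gives the conditional expected cost E_(r)(C).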

Proof of Theorem 5

We prove that the upper bound holds for each realization G=(V, E) of the random graph. Equation (13) is satisfied for every possible starting point x∈V of the attack; therefore Var_(r)(C)=E_(r)(S) Var(C₀)+Var_(r)(S)(E(C₀))²

where r=d(0, x). In particular, it suffices to prove that Var_(r)(S)≤(p^(−2R)−1)(E_(r)(S))².  (32)

Recalling the definition of S, we have

${{Var}_{r}(S)} = {{E_{x}\left( \left( {\sum\limits_{y \in V}{\zeta(y)}} \right)^{2} \right)} - \left( {E_{x}\left( {\sum\limits_{y \in V}{\zeta(y)}} \right)} \right)^{2}}$

then, using that E_(x)(ζ(y)ζ(z))=P_(x)(ζ(y)=ζ(z)=1),

$\begin{matrix} {{{Var}_{r}(S)} = {{{\sum\limits_{y,{z \in V}}{P_{x}\left( {{\zeta(y)} = {{\zeta(z)} = 1}} \right)}} - {\sum\limits_{y,{z \in V}}{{P_{x}\left( {{\zeta(y)} = 1} \right)}{P_{x}\left( {{\zeta(z)} = 1} \right)}}}} = {\sum\limits_{y,{z \in V}}{\left( {{P_{x}\left( {{\zeta(y)} = {{\zeta(z)} = 1}} \right)} - {{P_{x}\left( {{\zeta(y)} = 1} \right)}{P_{x}\left( {{\zeta(z)} = 1} \right)}}} \right).}}}} & (33) \end{matrix}$

To simplify the notation, let q_(x)(y,z)=P_(x)(ζ(y)=ζ(z)=1)−P_(x)(ζ(y)=1)P_(x)(ζ(z)=1).

The probability that vertex y is hacked is simply p raised to the power d(x, y), which is also the length of the unique self-avoiding path x→y. The probability that two given vertices y and z are both hacked is more complicated because the two self-avoiding paths x→y and x→z can overlap. The probability that both self-avoiding paths are open is p raised to the power d(x,y)+d(x,z)−r(x,y,z)

where r(x, y, z) is the number of edges the two self-avoiding paths x→y and x→z have in common, which is always at most the diameter of the graph and hence at most 2R; therefore

$\begin{matrix} {{q_{x}\left( {y,z} \right)} = {p^{d(x,y) + d(x,z) - r(x,y,z)} - p^{d(x,y) + d(x,z)}} = {p^{d(x,y) + d(x,z)}\left( p^{- r(x,y,z)} - 1 \right)} \leq {p^{d(x,y) + d(x,z)}\left( p^{- 2R} - 1 \right).}} & (34) \end{matrix}$

Combining (33) and (34), we conclude that

${{Var}_{r}(S)} = {{{\sum\limits_{y,{z \in V}}{q_{x}\left( {y,z} \right)}} \leq {\sum\limits_{y,{z \in V}}{p^{{d{({x,y})}} + {d{({x,z})}}}\left( {p^{{- 2}R} - 1} \right)}}} = {{\left( {p^{{- 2}R} - 1} \right){\sum\limits_{y,{z \in V}}{{P_{x}\left( {{\zeta(y)} = 1} \right)}{P_{x}\left( {{\zeta(z)} = 1} \right)}}}} = {\left( {p^{{- 2}R} - 1} \right){\left( {E_{r}(S)} \right)^{2}.}}}}$

This shows (32) and completes the proof. □
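The bound (32) can be checked exactly on a small example. The sketch below is an illustration under stated assumptions rather than part of the proof: it uses a deterministic d-ary tree of radius R, encodes vertices as tuples (the root is the empty tuple), and assumes, as in the argument above, that a vertex is hacked exactly when every edge on the self-avoiding path from the starting point x is open, each edge being open independently with probability p. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def vertices(d, R):
    """All vertices of the d-ary tree of radius R, encoded as tuples; the root is ()."""
    return [v for depth in range(R + 1) for v in product(range(d), repeat=depth)]


def edges_to_root(v):
    """Edges on the path from v to the root, each edge named by its lower endpoint."""
    return {v[:i] for i in range(1, len(v) + 1)}


def mean_and_variance(d, R, x, p):
    """Exact E_x(S) and Var_x(S) for the attack started at vertex x."""
    up_x = edges_to_root(x)
    # In a tree, the path x -> y is the symmetric difference of the two root paths.
    paths = [up_x ^ edges_to_root(y) for y in vertices(d, R)]
    mean = sum(p ** len(e) for e in paths)
    # y and z are both hacked exactly when every edge in the union of the two paths is open.
    second_moment = sum(p ** len(e | f) for e in paths for f in paths)
    return mean, second_moment - mean ** 2


def satisfies_bound(d, R, x, p):
    """Check inequality (32): Var_r(S) <= (p^(-2R) - 1) (E_r(S))^2."""
    mean, var = mean_and_variance(d, R, x, p)
    return var <= (p ** (-2 * R) - 1) * mean ** 2


if __name__ == "__main__":
    # Illustrative values only: binary tree of radius 2, attack started at depth 1.
    print(satisfies_bound(2, 2, (0,), Fraction(1, 2)))
```

The double loop over pairs of vertices is quadratic in the size of the tree, so this is only meant for small sanity checks.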

Proof of Corollary 6

The key to proving Corollary 6 is to condition on the distance D between the root and the starting point of the attack. For the deterministic tree with X=d,

$\begin{matrix} {{P\left( {D = r} \right)} = {\frac{d^{r}}{{card}\mspace{14mu}(V)} = {\left( \frac{d^{r}}{1 + d + \ldots + d^{R}} \right) = {{d^{r}\left( \frac{1 - d}{1 - d^{R + 1}} \right)}.}}}} & (35) \end{matrix}$

In addition, the conditional expectation given D is equal to

$\begin{matrix} {{E\left( {C❘D} \right)} = {{\sum\limits_{r = 0}^{R}{{E\left( {{C❘D} = r} \right)}1\left\{ {D = r} \right\}}} = {\sum\limits_{r = 0}^{R}{{E_{r}(C)}1{\left\{ {D = r} \right\}.}}}}} & (36) \end{matrix}$

Taking the expected value in (36), and using (35) and Theorem 4,

${E(C)} = {{E\left( {E\left( {C❘D} \right)} \right)} = {{\sum\limits_{r = 0}^{R}{{E_{r}(C)}{P\left( {D = r} \right)}}} = {\left( \frac{1 - d}{1 - d^{R + 1}} \right){\sum\limits_{r = 0}^{R}{d^{r}{E_{r}(S)}{E\left( C_{0} \right)}}}}}}$

which proves the first part of the corollary. Now, taking the variance in (36),

${{Var}\left( {E\left( {C❘D} \right)} \right)} = {{{Var}\left( {\sum\limits_{r = 0}^{R}{{E_{r}(C)}1\left\{ {D = r} \right\}}} \right)} = {{\sum\limits_{r,s}{{E_{r}(C)}{E_{s}(C)}{{cov}\left( {{1\left\{ {D = r} \right\}},{1\left\{ {D = s} \right\}}} \right)}}} = {\sum\limits_{r,s}{{E_{r}(C)}{E_{s}(C)}\left( {{P\left( {{D = r},{D = s}} \right)} - {{P\left( {D = r} \right)}{P\left( {D = s} \right)}}} \right)}}}}$

then distinguishing between r=s and r≠s, and using (35) and Theorem 4,

$\begin{matrix} {{{Var}\left( {E\left( {C❘D} \right)} \right)} = {{\sum\limits_{r = 0}^{R}{\left( \frac{d^{r}\left( {1 - d} \right)}{1 - d^{R + 1}} \right)\left( {1 - \frac{d^{r}\left( {1 - d} \right)}{1 - d^{R + 1}}} \right)\left( {{E_{r}(S)}{E\left( C_{0} \right)}} \right)^{2}}} - {\sum\limits_{r \neq s}{{d^{r + s}\left( \frac{1 - d}{1 - d^{R + 1}} \right)}^{2}{E_{r}(S)}{E_{s}(S)}{\left( {E\left( C_{0} \right)} \right)^{2}.}}}}} & (37) \end{matrix}$

In addition, the conditional variance is

${{Var}\left( {C❘D} \right)} = {{\sum\limits_{r = 0}^{R}{{{Var}\left( {{C❘D} = r} \right)}1\left\{ {D = r} \right\}}} = {\sum\limits_{r = 0}^{R}{{{Var}_{r}(C)}1\left\{ {D = r} \right\}}}}$

Then taking the expected value and applying Theorem 5,

$\begin{matrix} {{E\left( {{Var}\left( {C\text{|}D} \right)} \right)} = {{\sum\limits_{r = 0}^{R}{{{Var}_{r}(C)}{P\left( {D = r} \right)}}} = {{\sum\limits_{r = 0}^{R}{{d^{r}\left( \frac{1 - d}{1 - d^{R + 1}} \right)}{{Var}_{r}(C)}}} \leq {\sum\limits_{r = 0}^{R}{{d^{r}\left( \frac{1 - d}{1 - d^{R + 1}} \right)}{\left( {{{E_{r}(S)}{{Var}\left( C_{0} \right)}} + {\left( {p^{{- 2}R} - 1} \right)\left( {{E_{r}(S)}{E\left( C_{0} \right)}} \right)^{2}}} \right).}}}}}} & (38) \end{matrix}$

The law of total variance implies that the unconditional variance of the cost is equal to the sum of the left-hand side of (37) and the left-hand side of (38). This gives the bound

${{{Var}(C)} \leq {\left( \frac{1 - d}{1 - d^{R + 1}} \right)\left( {{\sum\limits_{r = 0}^{R}{d^{r}{E_{r}(S)}{{Var}\left( C_{0} \right)}}} + {\left( {{\sum\limits_{r = 0}^{R}{{d^{r}\left( {p^{{- 2}R} - 1} \right)}\left( {E_{r}(S)} \right)^{2}}} + {\sum\limits_{r = 0}^{R}{{d^{r}\left( {1 - \frac{d^{r}\left( {1 - d} \right)}{1 - d^{R + 1}}} \right)}\left( {E_{r}(S)} \right)^{2}}} - {\left( \frac{1 - d}{1 - d^{R + 1}} \right){\sum\limits_{r \neq s}{d^{r + s}{E_{r}(S)}{E_{s}(S)}}}}} \right)\left( {E\left( C_{0} \right)} \right)^{2}}} \right)}} = {\left( \frac{1 - d}{1 - d^{R + 1}} \right)\left( {{\sum\limits_{r = 0}^{R}{d^{r}{E_{r}(S)}{{Var}\left( C_{0} \right)}}} + {\left( {{\sum\limits_{r = 0}^{R}{d^{r}{p^{{- 2}R}\left( {E_{r}(S)} \right)}^{2}}} - {\left( \frac{1 - d}{1 - d^{R + 1}} \right){\sum\limits_{r,s}{d^{r + s}{E_{r}(S)}{E_{s}(S)}}}}} \right)\left( {E\left( C_{0} \right)} \right)^{2}}} \right)}$

which proves the second part of the corollary. Finally, the expression of the conditional expected value E_(r)(S) for the tree with X=d is easily obtained by setting

$\mu = {{E(X)} = {{d\mspace{14mu}{and}\mspace{14mu}\overset{\_}{\mu}} = {{\frac{d}{1 - 0} - 1} = {d - 1}}}}$

in the expression for the conditional expected value in Theorem 4.
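The conditioning argument in this proof can be sketched numerically. The code below is an illustration only, with hypothetical function names: it builds the distribution (35) of the distance D for an integer d≥2, then combines user-supplied conditional means and variances through the tower property and the law of total variance. The conditional values passed in the example are placeholders, not results from the model.

```python
from fractions import Fraction


def distance_distribution(d, R):
    """P(D = r) = d^r (1 - d) / (1 - d^(R+1)) for r = 0..R, as in (35); requires d != 1."""
    norm = Fraction(1 - d, 1 - d ** (R + 1))
    return [d ** r * norm for r in range(R + 1)]


def total_mean_and_variance(cond_mean, cond_var, probs):
    """Combine conditional moments given D = r into unconditional ones.

    mean     = sum_r P(D = r) * E(C | D = r)
    variance = E(Var(C | D)) + Var(E(C | D))   (law of total variance)
    """
    mean = sum(q * m for q, m in zip(probs, cond_mean))
    var_of_mean = sum(q * m * m for q, m in zip(probs, cond_mean)) - mean ** 2
    mean_of_var = sum(q * v for q, v in zip(probs, cond_var))
    return mean, mean_of_var + var_of_mean


if __name__ == "__main__":
    probs = distance_distribution(2, 2)
    # Placeholder conditional means and variances, chosen only to exercise the function.
    print(total_mean_and_variance([1, 2, 3], [0, 0, 0], probs))
```

In the setting of the corollary, `cond_mean` would hold the values E_(r)(C) from Theorem 4 and `cond_var` the upper bounds on Var_(r)(C) from Theorem 5, in which case the second returned value is an upper bound rather than the exact variance.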

Appendix A: Proof of Lemma 10

PROOF OF LEMMA 10. A direct calculation gives

$\begin{matrix} {S_{1} = {\sum\limits_{n = 0}^{R}\left( {v^{n - 1}{\sum\limits_{k = 0}^{n - 1}v^{k}}} \right)} = {\sum\limits_{n = 0}^{R}\left( {v^{n - 1}\left( \frac{1 - v^{n}}{1 - v} \right)} \right)} = {\frac{1}{1 - v}{\sum\limits_{n = 1}^{R}\left( {v^{n - 1} - v^{{2n} - 1}} \right)}} = {\frac{1}{1 - v}{\sum\limits_{n = 0}^{R - 1}\left( {v^{n} - {v \cdot v^{2n}}} \right)}} = {\left( \frac{1}{1 - v} \right){\left( {\frac{1 - v^{R}}{1 - v} - \frac{v\left( {1 - v^{2R}} \right)}{1 - v^{2}}} \right).}}} & (39) \end{matrix}$

Similarly, we have

$S_{2} = {{\sum\limits_{m = 1}^{R}{\sum\limits_{n = 0}^{m - 1}\left( {v^{m - 1}{\sum\limits_{k = 0}^{n - 1}v^{k}}} \right)}} = {{\sum\limits_{m = 1}^{R}{\sum\limits_{n = 0}^{m - 1}\left( {v^{m - 1}\left( \frac{1 - v^{n}}{1 - v} \right)} \right)}} = {{\frac{1}{1 - v}{\sum\limits_{m = 1}^{R}\left( {v^{m - 1}{\sum\limits_{n = 0}^{m - 1}\left( {1 - v^{n}} \right)}} \right)}} = {\frac{1}{1 - v}{\sum\limits_{m = 1}^{R}{{v^{m - 1}\left( {m - \frac{1 - v^{m}}{1 - v}} \right)}.}}}}}}$

Observing also that

${{\sum\limits_{m = 1}^{R}{mv}^{m - 1}} = {{\frac{\partial}{\partial v}\left( {\sum\limits_{m = 0}^{R}v^{m}} \right)} = {{\frac{\partial}{\partial v}\left( \frac{1 - v^{R + 1}}{1 - v} \right)} = \frac{1 - {v^{R}\left( {1 + {R\left( {1 - v} \right)}} \right)}}{\left( {1 - v} \right)^{2}}}}},$

we deduce that S₂ is given by

$\begin{matrix} {S_{2} = {{\frac{1}{1 - v}\left( {\frac{1 - {v^{R}\left( {1 + {R\left( {1 - v} \right)}} \right)}}{\left( {1 - v} \right)^{2}} - {\frac{1}{1 - v}{\sum\limits_{m = 0}^{R - 1}\left( {v^{m} - {v \cdot v^{2m}}} \right)}}} \right)} = {{\frac{1}{1 - v}\left( {\frac{1 - {v^{R}\left( {1 + {R\left( {1 - v} \right)}} \right)}}{\left( {1 - v} \right)^{2}} - {\frac{1}{1 - v}\left( {\frac{1 - v^{R}}{1 - v} - \frac{v\left( {1 - v^{2R}} \right)}{1 - v^{2}}} \right)}} \right)} = {\frac{1}{\left( {1 - v} \right)^{2}}{\left( {{- {Rv}^{R}} + \frac{v\left( {1 - v^{2R}} \right)}{1 - v^{2}}} \right).}}}}} & (40) \end{matrix}$

Since Var₀(S)=Σ²(S₁+2S₂) according to Lemma 9, (39) and (40) give

${{Var}_{0}(S)} = {{{\frac{\Sigma^{2}}{1 - v}\left( {\frac{1 - v^{R}}{1 - v} - \frac{v\left( {1 - v^{2R}} \right)}{1 - v^{2}}} \right)} + {\frac{2\Sigma^{2}}{\left( {1 - v} \right)^{2}}\left( {{- {Rv}^{R}} + \frac{v\left( {1 - v^{2R}} \right)}{1 - v^{2}}} \right)}} = {{\frac{\Sigma^{2}}{1 - v}\left( {\frac{1 - {\left( {{2R} + 1} \right)v^{R}}}{1 - v} - {\left( {1 - \frac{2}{1 - v}} \right)\frac{v\left( {1 - v^{2R}} \right)}{1 - v^{2}}}} \right)} = {{\frac{\Sigma^{2}}{\left( {1 - v} \right)^{2}}\left( {1 - {\left( {{2R} + 1} \right)v^{R}} + \frac{v\left( {1 - v^{2R}} \right)}{1 - v}} \right)} = {\frac{\Sigma^{2}}{\left( {1 - v} \right)^{2}}{\left( {\frac{1 - v^{{2R} + 1}}{1 - v} - {\left( {{2R} + 1} \right)v^{R}}} \right).}}}}}$

This completes the proof.
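The algebra in this proof can be cross-checked by evaluating the sums S₁ and S₂ term by term and comparing S₁+2S₂ with the final bracket in the expression for Var₀(S). The sketch below is illustrative only; the function names are assumptions, the common factor Σ² is omitted because it multiplies both sides, and v≠1 is assumed.

```python
from fractions import Fraction


def geometric(n, v):
    """Partial geometric sum v^0 + ... + v^(n-1); equals 0 when n = 0."""
    return sum(v ** k for k in range(n))


def s1_s2_direct(R, v):
    """Evaluate S1 and S2 term by term from their defining sums."""
    s1 = sum(v ** (n - 1) * geometric(n, v) for n in range(1, R + 1))
    s2 = sum(v ** (m - 1) * geometric(n, v)
             for m in range(1, R + 1) for n in range(m))
    return s1, s2


def bracket_closed(R, v):
    """Closed form of S1 + 2*S2, i.e. Var_0(S) divided by Sigma^2."""
    return ((1 - v ** (2 * R + 1)) / (1 - v) - (2 * R + 1) * v ** R) / (1 - v) ** 2


if __name__ == "__main__":
    v = Fraction(3, 2)  # illustrative value only
    s1, s2 = s1_s2_direct(4, v)
    print(s1 + 2 * s2 == bracket_closed(4, v))
```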

Appendix B: Proof of Lemmas 16-20

The appendix gives the proofs of all the lemmas that rely on basic algebra and do not involve any probabilistic or combinatorial reasoning.

PROOF OF LEMMA 16. The first sum in (23) can be written as

$\begin{matrix} {{S_{1}^{-}(r)} = {{\sum\limits_{n = 1}^{r}{\left( {1 + {\overset{\_}{\mu}\left( \frac{1 - \mu^{n - 1}}{1 - \mu} \right)}} \right)p^{n}}} = {{{\sum\limits_{n = 1}^{r}{\left( {1 + \frac{\overset{\_}{\mu}}{1 - \mu}} \right)p^{n}}} - {\sum\limits_{n = 1}^{r}{\left( \frac{\overset{\_}{\mu}p}{1 - \mu} \right)\left( {\mu\; p} \right)^{n - 1}}}} = {{\left( \frac{1 - \left( {\mu - \overset{\_}{\mu}} \right)}{1 - \mu} \right)\left( \frac{p\left( {1 - p^{r}} \right)}{1 - p} \right)} - {\left( \frac{\overset{\_}{\mu}p}{1 - \mu} \right)\left( \frac{1 - \left( {\mu\; p} \right)^{r}}{1 - {\mu\; p}} \right)}}}}} & (41) \end{matrix}$

The second sum in (23) is

$\begin{matrix} {{S_{2}^{-}(r)} = {{\sum\limits_{n = {r + 1}}^{R - r}{\left( {\overset{\_}{\mu}{\mu^{n - r - 1}\left( \frac{1 - \mu^{r}}{1 - \mu} \right)}} \right)p^{n}}} = {{\overset{\_}{\mu}{p^{r + 1}\left( \frac{1 - \mu^{r}}{1 - \mu} \right)}{\sum\limits_{n = {r + 1}}^{R - r}\left( {\mu\; p} \right)^{n - r - 1}}} = {\overset{\_}{\mu}{p^{r + 1}\left( \frac{1 - \mu^{r}}{1 - \mu} \right)}\left( \frac{1 - \left( {\mu\; p} \right)^{R - {2r}}}{1 - {\mu\; p}} \right)}}}} & (42) \end{matrix}$

while the third sum in (23) is

$\begin{matrix} {{S_{3}^{-}(r)} = {{\sum\limits_{n = 0}^{R - r}\left( {\mu\; p} \right)^{n}} = {\frac{1 - \left( {\mu\; p} \right)^{R - r + 1}}{1 - {\mu\; p}}.}}} & (43) \end{matrix}$

To compute the double sum in (23), we first sum over l to get

${S_{4}^{-}(r)} = {{\sum\limits_{k = 1}^{r}{\sum\limits_{l = 1}^{k}{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}\mu^{R - r + {2l} - k - 2}p^{R - r + {2l} - 1}}}} = {{\sum\limits_{k = 1}^{r}{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}\mu^{R - r - k}p^{R - r + 1}{\sum\limits_{l = 0}^{k - 1}\left( {\mu\; p} \right)^{2l}}}} = {\sum\limits_{k = 1}^{r}{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}\mu^{R - r - k}{{p^{R - r + 1}\left( \frac{1 - \left( {\mu\; p} \right)^{2k}}{1 - \left( {\mu\; p} \right)^{2}} \right)}.}}}}}$

Then, using 1−(μp)²=(1−μp)(1+μp) and summing over k,

$\begin{matrix} {{S_{4}^{-}(r)} = {{\left( \frac{\overset{\_}{\mu}p^{R - r + 1}}{1 - {\mu\; p}} \right)\left( {{\mu^{R - {2r}}{\sum\limits_{k = 0}^{r - 1}\mu^{k}}} - {\mu^{R - r + 1}p^{2}{\sum\limits_{k = 0}^{r - 1}\left( {\mu\; p^{2}} \right)^{k}}}} \right)} = {\left( \frac{\overset{\_}{\mu}p^{R - r + 1}}{1 - {\mu\; p}} \right){\left( {\frac{\mu^{R - {2r}}\left( {1 - \mu^{r}} \right)}{1 - \mu} - \frac{\mu^{R - r + 1}{p^{2}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}}{1 - {\mu\; p^{2}}}} \right).}}}} & (44) \end{matrix}$

Combining (23) and (41)-(44) completes the proof. □
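The four closed forms (41)-(44) can be compared with the sums they come from. The sketch below is an illustration, not part of the proof: the function names are hypothetical, the summands are copied from the left-hand sides of (41)-(44), exact rationals are used, and the parameters are assumed to satisfy μ≠1, p≠1, μp≠1 and μp²≠1, with 0≤r≤R/2.

```python
from fractions import Fraction


def raw_sums(r, R, mu, mu_bar, p):
    """Evaluate the four sums on the left-hand sides of (41)-(44) term by term."""
    s1 = sum((1 + mu_bar * (1 - mu ** (n - 1)) / (1 - mu)) * p ** n
             for n in range(1, r + 1))
    s2 = sum(mu_bar * mu ** (n - r - 1) * (1 - mu ** r) / (1 - mu) * p ** n
             for n in range(r + 1, R - r + 1))
    s3 = sum((mu * p) ** n for n in range(R - r + 1))
    s4 = sum(mu_bar * (1 + mu * p)
             * mu ** (R - r + 2 * l - k - 2) * p ** (R - r + 2 * l - 1)
             for k in range(1, r + 1) for l in range(1, k + 1))
    return s1, s2, s3, s4


def closed_forms(r, R, mu, mu_bar, p):
    """Right-hand sides of (41)-(44)."""
    s1 = (((1 - (mu - mu_bar)) / (1 - mu)) * (p * (1 - p ** r) / (1 - p))
          - (mu_bar * p / (1 - mu)) * ((1 - (mu * p) ** r) / (1 - mu * p)))
    s2 = (mu_bar * p ** (r + 1) * ((1 - mu ** r) / (1 - mu))
          * ((1 - (mu * p) ** (R - 2 * r)) / (1 - mu * p)))
    s3 = (1 - (mu * p) ** (R - r + 1)) / (1 - mu * p)
    s4 = (mu_bar * p ** (R - r + 1) / (1 - mu * p)) * (
        mu ** (R - 2 * r) * (1 - mu ** r) / (1 - mu)
        - mu ** (R - r + 1) * p ** 2 * (1 - (mu * p ** 2) ** r) / (1 - mu * p ** 2))
    return s1, s2, s3, s4


if __name__ == "__main__":
    # Illustrative parameter values only.
    args = (2, 6, Fraction(2), Fraction(3, 2), Fraction(1, 3))
    print(raw_sums(*args) == closed_forms(*args))
```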

PROOF OF LEMMA 17. Following the proof of Lemma 16, the first sum in (27) is

$\begin{matrix} {{S_{1}^{+}(r)} = {{\left( \frac{1 - \left( {\mu - \overset{\_}{\mu}} \right)}{1 - \mu} \right)\left( \frac{p\left( {1 - p^{R - r}} \right)}{1 - p} \right)} - {\left( \frac{\overset{\_}{\mu}p}{1 - \mu} \right){\left( \frac{1 - \left( {\mu\; p} \right)^{R - r}}{1 - {\mu\; p}} \right).}}}} & (45) \end{matrix}$

The second sum in (27) is

$\begin{matrix} {{S_{2}^{+}(r)} = {{\sum\limits_{n = 0}^{R - r}\left( {\mu\; p} \right)^{n}} = \frac{1 - \left( {\mu\; p} \right)^{R - r + 1}}{1 - {\mu\; p}}}} & (46) \end{matrix}$

while the third sum in (27) is

$\begin{matrix} {{S_{3}^{+}(r)} = {{\sum\limits_{l = 1}^{r - {R/2}}{\left( {1 + {\left( {1 + \overset{\_}{\mu}} \right)p}} \right)p^{R - r + {2l} - 1}}} = {{\left( {1 + {\left( {1 + \overset{\_}{\mu}} \right)p}} \right)p^{R - r + 1}{\sum\limits_{l = 0}^{r - {R/2} - 1}p^{2l}}} = {\left( {1 + {\left( {1 + \overset{\_}{\mu}} \right)p}} \right){{p^{R - r + 1}\left( \frac{1 - p^{{2r} - R}}{1 - p^{2}} \right)}.}}}}} & (47) \end{matrix}$

Letting the common summand in the two double sums in (27) be $a(R,r,l,k) = \overset{\_}{\mu}\left( 1 + \mu\; p \right)\mu^{R - r + 2l - k - 2}p^{R - r + 2l - 1}$

and adding these two double sums gives

${{\sum\limits_{l = 1}^{r - {R/2}}{\sum\limits_{k = l}^{R - r + {2l} - 2}{a\left( {R,r,l,k} \right)}}} + {\sum\limits_{l = {r - {R/2} + 1}}^{r}{\sum\limits_{k = l}^{r}{a\left( {R,r,l,k} \right)}}}} = {{{\sum\limits_{l = 1}^{r - {R/2}}\left( {{\sum\limits_{k = l}^{r}{a\left( {R,r,l,k} \right)}} - {\sum\limits_{k = {R - r + {2l} - 1}}^{r}{a\left( {R,r,l,k} \right)}}} \right)} + {\sum\limits_{l = {r - {R/2} + 1}}^{r}{\sum\limits_{k = l}^{r}{a\left( {R,r,l,k} \right)}}}} = {{{\sum\limits_{l = 1}^{r}{\sum\limits_{k = l}^{r}{a\left( {R,r,l,k} \right)}}} - {\sum\limits_{l = 1}^{r - {R/2}}{\sum\limits_{k = {R - r + {2l} - 1}}^{r}{a\left( {R,r,l,k} \right)}}}} = {{S_{4}^{+}(r)} - {{S_{5}^{+}(r)}.}}}}$

As in the proof of Lemma 16,

$\begin{matrix} {{S_{4}^{+}(r)} = {\left( \frac{\overset{\_}{\mu}p^{R - r + 1}}{1 - {\mu\; p}} \right){\left( {\frac{\mu^{R - {2r}}\left( {1 - \mu^{r}} \right)}{1 - \mu} - \frac{\mu^{R - r + 1}{p^{2}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}}{1 - {\mu\; p^{2}}}} \right).}}} & (48) \end{matrix}$

To compute S₅⁺(r), we first observe that

${\sum\limits_{k = {R - r + {2l} - 1}}^{r}\mu^{R - r + {2l} - k - 2}} = {{\mu^{R - {2r} + {2l} - 2}{\sum\limits_{k = 0}^{{2r} - R - {2l} + 1}\mu^{k}}} = {{\mu^{R - {2r} + {2l} - 2}\left( \frac{1 - \mu^{{2r} - R - {2l} + 2}}{1 - \mu} \right)} = {- {\left( \frac{1 - \mu^{R - {2r} + {2l} - 2}}{1 - \mu} \right).}}}}$

It follows that

$\begin{matrix} {{S_{5}^{+}(r)} = {{- {\sum\limits_{l = 1}^{r - {R/2}}{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}{p^{R - r + {2l} - 1}\left( \frac{1 - \mu^{R - {2r} + {2l} - 2}}{1 - \mu} \right)}}}} = {{{- \left( \frac{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}p^{R - r + 1}}{1 - \mu} \right)}\left( {{\sum\limits_{l = 0}^{r - {R/2} - 1}p^{2l}} - {\mu^{R - {2r}}{\sum\limits_{l = 0}^{r - {R/2} - 1}\left( {\mu\; p} \right)^{2l}}}} \right)} = {{- \left( \frac{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}p^{R - r + 1}}{1 - \mu} \right)}{\left( {\left( \frac{1 - p^{{2r} - R}}{1 - p^{2}} \right) - {\mu^{R - {2r}}\left( \frac{1 - \left( {\mu\; p} \right)^{{2r} - R}}{1 - \left( {\mu\; p} \right)^{2}} \right)}} \right).}}}}} & (49) \end{matrix}$

Combining (27) and (45)-(49) completes the proof. □

PROOF OF LEMMA 18. First, we observe that the right-hand side of (29) is S₁⁻(R/2)+S₃⁻(R/2)+S₄⁻(R/2).

In particular, using (41), (43) and (44), and setting r=R/2, we get

${E_{R/2}(S)} = {{\left( \frac{1 - \left( {\mu - \overset{\_}{\mu}} \right)}{1 - \mu} \right)\left( \frac{p\left( {1 - p^{R/2}} \right)}{1 - p} \right)} - {\left( \frac{\overset{\_}{\mu}p}{1 - \mu} \right)\left( \frac{1 - \left( {\mu\; p} \right)^{R/2}}{1 - {\mu\; p}} \right)} + \frac{1 - \left( {\mu\; p} \right)^{{R/2} + 1}}{1 - {\mu\; p}} + {\left( \frac{\overset{\_}{\mu}p^{{R/2} + 1}}{1 - {\mu\; p}} \right){\left( {\frac{1 - \mu^{R/2}}{1 - \mu} - \frac{\mu^{{R/2} + 1}{p^{2}\left( {1 - \left( {\mu\; p^{2}} \right)^{R/2}} \right)}}{1 - {\mu\; p^{2}}}} \right).}}}$

It is straightforward to check that this is equal to ϕ₋(R/2). □

PROOF OF LEMMA 19. This is similar to the proof of Lemma 18. Referring to the notation introduced in the proof of Lemma 17, we observe that the right-hand side of (30) is 1+S₃⁺(R)+S₄⁺(R)−S₅⁺(R).

Then, using (47)-(49), and setting r=R, we get

${E_{R}(S)} = {1 + {\left( {1 + {\left( {1 + \overset{\_}{\mu}} \right)p}} \right){p\left( \frac{1 - p^{R}}{1 - p^{2}} \right)}} + {\left( \frac{\overset{\_}{\mu}p}{1 - {\mu\; p}} \right)\left( {\frac{\mu^{- R}\left( {1 - \mu^{R}} \right)}{1 - \mu} - \frac{\mu\;{p^{2}\left( {1 - \left( {\mu\; p^{2}} \right)^{R}} \right)}}{1 - {\mu\; p^{2}}}} \right)} + {\left( \frac{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}p}{1 - \mu} \right){\left( {\left( \frac{1 - p^{R}}{1 - p^{2}} \right) - {\mu^{- R}\left( \frac{1 - \left( {\mu\; p} \right)^{R}}{1 - \left( {\mu\; p} \right)^{2}} \right)}} \right).}}}$

It is straightforward to check that this is equal to ϕ₊(R). □

PROOF OF LEMMA 20. Referring to the proof of Lemma 16, we write

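$\begin{matrix} {{S_{1}^{-}(r)} = {{\sigma_{1}^{+}(r)} - {\sigma_{1}^{-}(r)}}} \\ {{S_{2}^{-}(r)} = {\left( {1 - \left( {\mu\; p} \right)^{R - {2r}}} \right){\sigma_{2}(r)}}} \\ {{S_{3}^{-}(r)} = {\left( {1 - \left( {\mu\; p} \right)^{R - r + 1}} \right){\sigma_{3}(r)}}} \end{matrix}$

where σ₁⁺(r), σ₁⁻(r), σ₂(r) and σ₃(r) are given by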

$\begin{matrix} {{\sigma_{1}^{+}(r)} = {\left( \frac{1 - \left( {\mu - \overset{\_}{\mu}} \right)}{1 - \mu} \right)\left( \frac{p\left( {1 - p^{r}} \right)}{1 - p} \right)}} & {{\sigma_{1}^{-}(r)} = {\left( \frac{\overset{\_}{\mu}p}{1 - \mu} \right)\left( \frac{1 - \left( {\mu\; p} \right)^{r}}{1 - {\mu\; p}} \right)}} \\ {{\sigma_{2}(r)} = {\overset{\_}{\mu}{p^{r + 1}\left( \frac{1 - \mu^{r}}{1 - \mu} \right)}\left( \frac{1}{1 - {\mu\; p}} \right)}} & {{\sigma_{3}(r)} = {\left( \frac{1}{1 - {\mu\; p}} \right).}} \end{matrix}$

The numerator of σ₂(r) minus the numerator of σ₁ ⁻(r) reduces to

${{\overset{\_}{\mu}\;{p^{r + 1}\left( {1 - \mu^{r}} \right)}} - {\overset{\_}{\mu}\;{p\left( {1 - \left( {\mu\; p} \right)^{r}} \right)}}} = {\overset{\_}{\mu}{p\left( {p^{r} - 1} \right)}}$

from which it follows that

${{\sigma_{1}^{+}(r)} - {\sigma_{1}^{-}(r)} + {\sigma_{2}(r)}} = {{{\left( \frac{1 - \left( {\mu - \overset{\_}{\mu}} \right)}{1 - \mu} \right)\frac{p\left( {1 - p^{r}} \right)}{1 - p}} + \frac{\overset{\_}{\mu}{p\left( {p^{r} - 1} \right)}}{\left( {1 - \mu} \right)\left( {1 - {\mu\; p}} \right)}} = {{\frac{p\left( {1 - p^{r}} \right)}{1 - \mu}\left( {\frac{1 - \left( {\mu - \overset{\_}{\mu}} \right)}{1 - p} - \frac{\overset{\_}{\mu}}{1 - {\mu\; p}}} \right)} = {\frac{{p\left( {1 - p^{r}} \right)}\left( {1 - \mu} \right)\left( {1 - {\left( {\mu - \overset{\_}{\mu}} \right)p}} \right)}{\left( {1 - \mu} \right)\left( {1 - p} \right)\left( {1 - {\mu\; p}} \right)} = {\frac{{p\left( {1 - p^{r}} \right)}\left( {1 - {\left( {\mu - \overset{\_}{\mu}} \right)p}} \right)}{\left( {1 - p} \right)\left( {1 - {\mu\; p}} \right)}.}}}}$

Adding also σ₃(r), we obtain

$\begin{matrix} {{{S_{1}^{-}(r)} + {\sigma_{2}(r)} + {\sigma_{3}(r)}} = {\frac{1}{1 - {\mu\; p}}{\left( {1 + \frac{{p\left( {1 - p^{r}} \right)}\left( {1 - {\left( {\mu - \overset{\_}{\mu}} \right)p}} \right)}{1 - p}} \right).}}} & (50) \end{matrix}$

Now, we write S₄ ⁻(r)=σ₄ ⁺(r)−σ₄ ⁻(r) where

${\sigma_{4}^{+}(r)} = {\left( \frac{\overset{\_}{\mu}p^{R - r + 1}}{1 - {\mu\; p}} \right)\left( \frac{\mu^{R - {2r}}\left( {1 - \mu^{r}} \right)}{1 - \mu} \right)}$ ${{\sigma_{4}^{-}(r)} = {\left( \frac{\overset{\_}{\mu}p^{R - r + 1}}{1 - {\mu\; p}} \right)\left( \frac{\mu^{R - r + 1}{p^{2}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}}{1 - {\mu\; p^{2}}} \right)}},$

and observe that

$\begin{matrix} {{{\sigma_{4}^{+}(r)} - {\left( {\mu\; p} \right)^{R - {2r}}{\sigma_{2}(r)}}} = {{{\left( \frac{\overset{\_}{\mu}p^{R - r + 1}}{1 - {\mu\; p}} \right)\frac{\mu^{R - {2r}}\left( {1 - \mu^{r}} \right)}{1 - \mu}} - {\overset{\_}{\mu}{p^{r + 1}\left( \frac{1 - \mu^{r}}{1 - \mu} \right)}\frac{\left( {\mu\; p} \right)^{R - {2r}}}{1 - {\mu\; p}}}} = {{\frac{\overset{\_}{\mu}\left( {1 - \mu^{r}} \right)}{\left( {1 - {\mu\; p}} \right)\left( {1 - \mu} \right)}\left( {{p^{R - r + 1}\mu^{R - {2r}}} - {p^{r + 1}\left( {\mu\; p} \right)}^{R - {2r}}} \right)} = 0}}} & (51) \end{matrix}$

Note also that

$\begin{matrix} {{{\sigma_{4}^{-}(r)} + {\left( {\mu\; p} \right)^{R - r + 1}{\sigma_{3}(r)}}} = {{{\left( \frac{\overset{\_}{\mu}p^{R - r + 1}}{1 - {\mu\; p}} \right)\left( \frac{\mu^{R - r + 1}{p^{2}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}}{1 - {\mu\; p^{2}}} \right)} + \frac{\left( {\mu\; p} \right)^{R - r + 1}}{1 - {\mu\; p}}} = {{\frac{\left( {\mu\; p} \right)^{R - r + 1}}{\left( {1 - {\mu\; p}} \right)\left( {1 - {\mu\; p^{2}}} \right)}\left( {{\overset{\_}{\mu}{p^{2}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}} + \left( {1 - {\mu\; p^{2}}} \right)} \right)} = {\frac{\left( {\mu\; p} \right)^{R - r + 1}\left( {1 - {p^{2}\left( {\mu - {\overset{\_}{\mu}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}} \right)}} \right)}{\left( {1 - {\mu\; p}} \right)\left( {1 - {\mu\; p^{2}}} \right)}.}}}} & (52) \end{matrix}$

Combining (50)-(52), we deduce that

${\phi_{-}(r)} = {\frac{1}{1 - {\mu\; p}}{\left( {1 + \frac{{p\left( {1 - p^{r}} \right)}\left( {1 - {\left( {\mu - \overset{\_}{\mu}} \right)p}} \right)}{1 - p} - \frac{\left( {\mu\; p} \right)^{R - r + 1}\left( {1 - {p^{2}\left( {\mu - {\overset{\_}{\mu}\left( {1 - \left( {\mu\; p^{2}} \right)^{r}} \right)}} \right)}} \right)}{1 - {\mu\; p^{2}}}} \right).}}$

To complete the proof of the lemma, it suffices to show that ϕ₊(r)−ϕ₋(r)=0. Referring to the proofs of Lemmas 16 and 17, we have S₃⁻(r)=S₂⁺(r) and S₄⁻(r)=S₄⁺(r)

from which it follows that

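${{\phi_{+}(r)} - {\phi_{-}(r)}} = {{\left( \frac{1 - \left( {\mu - \overset{\_}{\mu}} \right)}{1 - \mu} \right)\left( \frac{p\left( {p^{r} - p^{R - r}} \right)}{1 - p} \right)} - {\left( \frac{\overset{\_}{\mu}p}{1 - \mu} \right)\left( \frac{\left( {\mu\; p} \right)^{r} - \left( {\mu\; p} \right)^{R - r}}{1 - {\mu\; p}} \right)} - {\overset{\_}{\mu}{p^{r + 1}\left( \frac{1 - \mu^{r}}{1 - \mu} \right)}\left( \frac{1 - \left( {\mu\; p} \right)^{R - {2r}}}{1 - {\mu\; p}} \right)} + {\left( {1 + {\left( {1 + \overset{\_}{\mu}} \right)p}} \right){p^{R - r + 1}\left( \frac{1 - p^{{2r} - R}}{1 - p^{2}} \right)}} + {\left( \frac{{\overset{\_}{\mu}\left( {1 + {\mu\; p}} \right)}p^{R - r + 1}}{1 - \mu} \right){\left( {\left( \frac{1 - p^{{2r} - R}}{1 - p^{2}} \right) - {\mu^{R - {2r}}\left( \frac{1 - \left( {\mu\; p} \right)^{{2r} - R}}{1 - \left( {\mu\; p} \right)^{2}} \right)}} \right).}}}$

Multiplying by (1−μ)(1−p²)(1−(μp)²), we get

$\begin{matrix} {\left( 1 - \mu \right)\left( 1 - p^{2} \right)\left( 1 - \left( \mu\; p \right)^{2} \right)\left( \phi_{+}(r) - \phi_{-}(r) \right) = \left( 1 + p \right)\left( 1 - \left( \mu\; p \right)^{2} \right)\left( 1 - \left( \mu - \overset{\_}{\mu} \right) \right)p\left( p^{r} - p^{R - r} \right)} \\ {- \left( 1 - p^{2} \right)\left( 1 + \mu\; p \right)\left\lbrack \left( \mu\; p \right)^{r} - \left( \mu\; p \right)^{R - r} + p^{r}\left( 1 - \mu^{r} \right)\left( 1 - \left( \mu\; p \right)^{R - 2r} \right) \right\rbrack\overset{\_}{\mu}p} \\ {+ \left( 1 - \mu \right)\left( 1 - \left( \mu\; p \right)^{2} \right)\left( 1 + \left( 1 + \overset{\_}{\mu} \right)p \right)p^{R - r + 1}\left( 1 - p^{2r - R} \right)} \\ {+ \overset{\_}{\mu}\left( 1 + \mu\; p \right)p^{R - r + 1}\left\lbrack \left( 1 - \left( \mu\; p \right)^{2} \right)\left( 1 - p^{2r - R} \right) - \left( 1 - p^{2} \right)\mu^{R - 2r}\left( 1 - \left( \mu\; p \right)^{2r - R} \right) \right\rbrack.} \end{matrix}$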

Expanding and simplifying shows that the right-hand side equals zero. □

Exemplary Computing Device Configured for Percolation Model

Referring to FIG. 5, a computing device 500 may be used to implement various aspects of the model 101 described herein. More particularly, in some embodiments, aspects of the model 101 may be translated to software or machine-level code, which may be installed to and/or executed by the computing device 500 such that the computing device 500 is configured to model or otherwise assess loss distribution for cyber risk of small or medium-sized enterprises with tree based LAN topology as described herein. It is contemplated that the computing device 500 may include any number of devices, such as personal computers, server computers, hand-held or laptop devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronic devices, network PCs, minicomputers, mainframe computers, digital signal processors, state machines, logic circuitries, distributed computing environments, and the like.

The computing device 500 may include various hardware components, such as a processor 502, a main memory 504 (e.g., a system memory), and a system bus 501 that couples various components of the computing device 500 to the processor 502. The system bus 501 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. For example, such architectures may include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as Mezzanine bus.

The computing device 500 may further include a variety of memory devices and computer-readable media 507 that includes removable/non-removable media and volatile/nonvolatile media and/or tangible media, but excludes transitory propagated signals. Computer-readable media 507 may also include computer storage media and communication media. Computer storage media includes removable/non-removable media and volatile/nonvolatile media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules or other data, such as RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that may be used to store the desired information/data and which may be accessed by the computing device 500. Communication media includes computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. For example, communication media may include wired media such as a wired network or direct-wired connection and wireless media such as acoustic, RF, infrared, and/or other wireless media, or some combination thereof. Computer-readable media may be embodied as a computer program product, such as software stored on computer storage media.

The main memory 504 includes computer storage media in the form of volatile/nonvolatile memory such as read only memory (ROM) and random access memory (RAM). A basic input/output system (BIOS), containing the basic routines that help to transfer information between elements within the computing device 500 (e.g., during start-up) is typically stored in ROM. RAM typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processor 502. Further, data storage 506 in the form of Read-Only Memory (ROM) or otherwise may store an operating system, application programs, and other program modules and program data.

The data storage 506 may also include other removable/non-removable, volatile/nonvolatile computer storage media. For example, the data storage 506 may be: a hard disk drive that reads from or writes to non-removable, nonvolatile magnetic media; a magnetic disk drive that reads from or writes to a removable, nonvolatile magnetic disk; a solid state drive; and/or an optical disk drive that reads from or writes to a removable, nonvolatile optical disk such as a CD-ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media may include magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The drives and their associated computer storage media provide storage of computer-readable instructions, data structures, program modules, and other data for the computing device 500.

A user may enter commands and information through a user interface 540 (displayed via a monitor 560) by engaging input devices 545 such as a tablet, electronic digitizer, microphone, keyboard, and/or pointing device, commonly referred to as a mouse, trackball, or touch pad. Other input devices 545 may include a joystick, game pad, satellite dish, scanner, or the like. Additionally, voice inputs, gesture inputs (e.g., via hands or fingers), or other natural user input methods may also be used with the appropriate input devices, such as a microphone, camera, tablet, touch pad, glove, or other sensor. These and other input devices 545 are in operative connection to the processor 502 and may be coupled to the system bus 501, but may also be connected by other interface and bus structures, such as a parallel port, game port, or universal serial bus (USB). A monitor 560 or other type of display device may also be connected to the system bus 501. The monitor 560 may also be integrated with a touch-screen panel or the like.

The computing device 500 may be implemented in a networked or cloud-computing environment using logical connections of a network interface 503 to one or more remote devices, such as a remote computer. The remote computer may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computing device 500. The logical connection may include one or more local area networks (LAN) and one or more wide area networks (WAN), but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.

When used in a networked or cloud-computing environment, the computing device 500 may be connected to a public and/or private network through the network interface 503. In such embodiments, a modem or other means for establishing communications over the network is connected to the system bus 501 via the network interface 503 or other appropriate mechanism. A wireless networking component including an interface and antenna may be coupled through a suitable device such as an access point or peer computer to a network. In a networked environment, program modules depicted relative to the computing device 500, or portions thereof, may be stored in the remote memory storage device.

Certain embodiments are described herein as including one or more modules. Such modules are hardware-implemented, and thus include at least one tangible unit capable of performing certain operations and may be configured or arranged in a certain manner. For example, a hardware-implemented module may comprise dedicated circuitry that is permanently configured (e.g., as a special-purpose processor, such as a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations. A hardware-implemented module may also comprise programmable circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software or firmware to perform certain operations. In some example embodiments, one or more computer systems (e.g., a standalone system, a client and/or server computer system, or a peer-to-peer computer system) or one or more processors may be configured by software (e.g., an application or application portion) as a hardware-implemented module that operates to perform certain operations as described herein.

Accordingly, the term “hardware-implemented module” encompasses a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner and/or to perform certain operations described herein. Considering embodiments in which hardware-implemented modules are temporarily configured (e.g., programmed), each of the hardware-implemented modules need not be configured or instantiated at any one instance in time. For example, where the hardware-implemented modules comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware-implemented modules at different times. Software may accordingly configure the processor 502, for example, to constitute a particular hardware-implemented module at one instance of time and to constitute a different hardware-implemented module at a different instance of time.

Hardware-implemented modules may provide information to, and/or receive information from, other hardware-implemented modules. Accordingly, the described hardware-implemented modules may be regarded as being communicatively coupled. Where multiple of such hardware-implemented modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware-implemented modules. In embodiments in which multiple hardware-implemented modules are configured or instantiated at different times, communications between such hardware-implemented modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware-implemented modules have access. For example, one hardware-implemented module may perform an operation, and may store the output of that operation in a memory device to which it is communicatively coupled. A further hardware-implemented module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware-implemented modules may also initiate communications with input or output devices.

It should be understood from the foregoing that, while particular embodiments have been illustrated and described, various modifications can be made thereto without departing from the spirit and scope of the invention as will be apparent to those skilled in the art. Such changes and modifications are within the scope and teachings of this invention as defined in the claims appended hereto. 

What is claimed is:
 1. A method for simulating losses due to cyberattacks comprising: providing a processor operable for executing instructions including: (1) generating a rooted random tree, including specifying a number of random trees to be constructed, specifying the values of parameters that characterize each of the plurality of simulations concerning random trees, applying a command for sampling from a multinomial distribution, based on the above parameters, to generate a plurality of offspring starting from the root of the random tree in a particular simulation, applying a loop that iterates through levels of the random tree in the particular simulation, and as the tree grows, labeling every vertex in the random tree by a unique integer in an ordered manner; (2) generating a contagion process of the edges of the random tree, including specifying the values of parameters that characterize randomness related to contagions of edges on a particular random tree, generating repeated samples from a Bernoulli distribution in a separate column array which has the same height as the matrix that stores the edges of the particular random tree, selecting only columns of the matrix which contain successful Bernoulli trials, storing those columns in a separate matrix, and considering that edges associated with the successful Bernoulli trials represent possible channels of the contagion process in a random tree; (3) selecting a node in the random tree that will be infected and represent a starting node of the contagion process; (4) identifying nodes that can be reached by the contagion process; (5) generating losses on each of the nodes and aggregating the losses; and (6) repeating steps (1)-(5) to generate new total losses originating from a cyber-attack.
 2. The method of claim 1, wherein choosing the node that will be infected and be the starting point of the contagion comprises: finding the maximum integer in the matrix that stores the edges of the particular random tree; and choosing an integer uniformly at random between one and the maximum above, in order to determine, in a particular tree, the infected node from which the contagion process starts.
 3. The method of claim 1, wherein identifying the nodes that can be reached by the contagion process comprises using the matrix of edges associated with successful Bernoulli trials in order to identify the set of nodes that can be reached from the starting node by the contagion process.
 4. The method of claim 1, wherein generating losses comprises: specifying the probability distribution and values of parameters that characterize each of the plurality of losses associated with the nodes reached by a particular contagion process; and given the above specification, generating and summing as many losses as the number of nodes that can be reached by a particular contagion process, the sum being considered to be a total loss caused by one cyber-attack.
 5. An apparatus for modeling loss distribution for cyber risk, comprising: a processor for implementing a structural percolation model that considers a physical layer of a network and assumes a tree network topology to model loss, and is configured to: (1) generate a rooted random tree; (2) generate a contagion process of the edges of the random tree, wherein the processor specifies the values of parameters that characterize randomness related to contagions of edges on a particular random tree, generates repeated samples from a Bernoulli distribution in a separate column array which has the same height as the matrix that stores the edges of the particular random tree, selects only columns of the matrix which contain successful Bernoulli trials, stores those columns in a separate matrix, and considers that edges associated with the successful Bernoulli trials represent possible channels of the contagion process in a random tree; (3) select a node in the random tree that will be infected and represent a starting node of the contagion process; (4) identify nodes that can be reached by the contagion process; (5) generate losses on each of the nodes and aggregate the losses; and (6) repeat steps (1)-(5) to generate new total losses originating from a cyber-attack.
 6. The apparatus of claim 5, wherein to generate the random tree the processor is further configured to: specify a number of random trees to be constructed; specify the values of parameters that characterize each of the plurality of simulations concerning random trees; apply a command for sampling from a multinomial distribution, based on the above parameters, to generate a plurality of offspring starting from the root of the random tree in a particular simulation; apply a loop that iterates through levels of the random tree in the particular simulation; and as the tree grows, label every vertex in the random tree by a unique integer in an ordered manner.
 7. The apparatus of claim 5, wherein to choose the node that will be infected and be the starting point of the contagion the processor is further configured to: find the maximum integer in the matrix that stores the edges of the particular random tree; and choose an integer uniformly at random between one and the maximum above, in order to determine, in a particular tree, the infected node from which the contagion process starts.
 8. A computer implemented method for modeling loss distribution for cyber risk, comprising: implementing a structural percolation model that considers a physical layer of a network and assumes a tree network topology to model loss, including: generating a random tree graph associated with a network topology, including specifying a number of random trees to be constructed; specifying the values of parameters that characterize each of the plurality of simulations concerning random trees; applying a command for sampling from a multinomial distribution, based on the above parameters, to generate a plurality of offspring starting from the root of the random tree in a particular simulation; applying a loop that iterates through levels of the random tree in the particular simulation; and as the tree grows, labeling every vertex in the random tree by a unique integer in an ordered manner; assigning a data asset to each node of the random tree graph related to a corresponding node of a plurality of network topology nodes associated with the network topology; representing the data asset as a random variable; defining a contagion process stemming from an event of a breach of a node of the plurality of network topology nodes given a particular temporal instance of the network topology; and summing all losses, given a particular node of the plurality of network topology nodes where an infection starts and a realization of the contagion process, to characterize one observation point in the aggregate loss distribution due to breach.